Office of Scale Research

Department of Marketing


Scale-related Pet-Peeves

Blog #7 (by Cheryl Jarvis,


Confusing Reflective & Formative Scales - Part 1

It’s been almost three decades since the marketing academic literature recognized that latent variables measured with multi-item scales, such as those used in structural equation modeling, can take two different forms: the reflectively-indicated measurement model common to classical test theory, and the formatively-indicated (or composite latent variable) measurement model (e.g., Bagozzi 1980; Fornell and Bookstein 1982).

Yet, when two colleagues and I reviewed all 1,192 latent variables measured with multiple items that had been published in the top four journals in the marketing literature between 1977 and 2000 (Journal of Marketing, Journal of Marketing Research, Journal of Consumer Research, and Marketing Science), we found that almost 30 percent of the constructs were misspecified (Jarvis, MacKenzie and Podsakoff 2003). The vast majority of those misspecified constructs were ones that authors had conceptualized and defined as formative, but then incorrectly modeled as reflective. Even though it has been several years since that review was published, I frequently see this same mistake in scholarly manuscripts I am asked to review for journals in our field.

So, in this first of three posts on the topic I will explain the distinction between these two measurement models. The second post will identify the costly empirical consequences of misspecifying a formatively-indicated construct as reflective. The third will offer suggestions for how to avoid this mistake and successfully use formative indicators.

Classical test theory assumes that the variation in the scores on measures of a construct is a function of the true score plus error. Thus, the underlying latent variable “causes” the observed variation in the measures, and when the model is pictured, the arrows representing the regression parameters are drawn as emanating from the construct to its measures. In this case, the measures are expected to be highly correlated, so internal consistency reliability is quite important. The scale items used to measure the construct are assumed to be sampled from the population of potential measures, such that they have similar content (that is, they are unidimensional) and are functionally interchangeable. Therefore, dropping an indicator from the measurement model of a reflectively-measured latent variable does not alter the meaning of the construct. Typical examples of appropriate applications of the reflective-indicator model include constructs such as attitudes (good-bad, like-dislike, favorable-unfavorable) or purchase intentions (likely-unlikely, probable-improbable, possible-impossible). Note how dropping any one item should not change what it is we are measuring.
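
To make the "true score plus error" idea concrete, here is a minimal simulation sketch. It is only an illustration: the loadings, error standard deviation, sample size, and function names are all hypothetical choices of mine, not values from any study. Each simulated item is generated from the same latent score, so the items should end up highly correlated with one another.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_reflective(n=2000, loadings=(0.9, 0.8, 0.85), error_sd=0.5, seed=1):
    """Generate items as loading * latent true score + independent error.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    latent = [rng.gauss(0, 1) for _ in range(n)]
    items = [[lam * t + rng.gauss(0, error_sd) for t in latent] for lam in loadings]
    return latent, items


latent, items = simulate_reflective()

# Inter-item correlations: every item reflects the same latent score.
r12 = pearson(items[0], items[1])
r13 = pearson(items[0], items[2])
r23 = pearson(items[1], items[2])
print(f"r12={r12:.2f}  r13={r13:.2f}  r23={r23:.2f}")
```

Because the only thing the items share is the latent score, any one of them could be dropped and the remaining items would still be measuring the same thing.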

This assumed direction of causality is conceptually appropriate in many cases, but not all. For some constructs, the reverse causal direction makes more sense – that is, causality flows from the measures to the construct. These constructs are often referred to as “composite” latent variables, because the scale often is a multidimensional “composite” of measures representing a collection of various behaviors or concepts, and those measures need not be interchangeable. Thus, in a formatively-indicated construct, the measures do not necessarily need to be correlated, although they may be.
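
The two causal directions can be written side by side. The notation below is my own shorthand for this post and should be read as an assumption rather than a canonical specification: x_i are the observed measures, \eta is the construct, \lambda_i are loadings, \varepsilon_i is item-level error, \gamma_i are indicator weights, and \zeta is a construct-level disturbance term.

```latex
% Reflective: each measure x_i is caused by the latent construct \eta
x_i = \lambda_i \eta + \varepsilon_i, \qquad i = 1, \dots, k

% Formative: the construct \eta is formed by its measures x_i
\eta = \sum_{i=1}^{k} \gamma_i x_i + \zeta
```

In the reflective equations the error sits at the level of each item, while in the formative equation it sits at the level of the construct. Nothing in the formative equation requires the x_i to be correlated with one another.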

For example, a researcher who is measuring the personal sales contacts a customer receives from an insurance agent may ask consumers to respond to semantic differential scales measuring the frequency of being contacted by their agent for the following reasons: “to increase the amount of my current life insurance policy,” “to describe new types of life insurance policies that have become available,” “to encourage me to keep my current life insurance policy in place as it stands.” These items may actually be mutually exclusive – an agent might encourage a customer to keep the current policy, or try to sell the customer more of the same policy, or try to sell the customer a different policy to replace it, but the agent would not try to do all three! Another common type of formative construct is one in which items may be correlated, but do not necessarily have to be. For example, a researcher may want to measure consumer belief structures about products (e.g., for toothpaste, measuring perceptions of a brand’s relative price, taste, decay prevention, and breath freshening), or evaluate salesperson performance on measures such as net profit, volume, new business development, customer retention, and customer satisfaction. In these cases, it is possible that the focal object being evaluated may rate highly on one or more attributes, but not on others.

The fact that formative measures need not be correlated means that measures of internal consistency simply are not relevant for these constructs, and in fact could even be misleading and damaging to scale validity. When a construct is formative, the measures represent a census of all concepts that form the construct, and all relevant dimensions must be included in the measures for the scale to validly represent the underlying construct. Therefore, the consequences of dropping one indicator can be quite serious because doing so could omit a unique part of the construct and change the meaning of the variable. If a researcher inappropriately applies a statistic such as Cronbach’s alpha to evaluate the internal consistency of a set of formative measures, and as a result deletes items that are not highly correlated with others in the scale as would be recommended for a reflective construct, then the construct validity of that formative scale would be undermined. However, measures of item reliability that are not dependent on internal consistency – for example, test-retest reliability – can appropriately be used with formative scales.
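
A small simulation can illustrate why internal consistency statistics mislead here. The sketch below is a hedged illustration, not an analysis of real data: the sample size, loadings, seed, and the choice of three fully independent formative indicators with equal weights are all assumptions I made for the example. It computes Cronbach’s alpha using the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score), for a reflective item set and for a formative one, then checks how much a formative composite changes when one indicator is dropped.

```python
import random
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a list of k item-score lists of equal length."""
    k = len(items)
    item_vars = [statistics.variance(col) for col in items]
    totals = [sum(row) for row in zip(*items)]
    return k / (k - 1) * (1 - sum(item_vars) / statistics.variance(totals))


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(7)  # fixed seed so the illustration is repeatable
n = 2000                # hypothetical sample size

# Reflective set: three items driven by one latent score plus independent error.
latent = [rng.gauss(0, 1) for _ in range(n)]
reflective = [[0.9 * t + rng.gauss(0, 0.5) for t in latent] for _ in range(3)]

# Formative set: three independent indicators (an assumed extreme case).
formative = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(3)]

alpha_reflective = cronbach_alpha(reflective)
alpha_formative = cronbach_alpha(formative)

# Equal-weight composite with all three indicators vs. with the third dropped.
full = [sum(row) for row in zip(*formative)]
reduced = [sum(row) for row in zip(*formative[:2])]
overlap = pearson(full, reduced)

print(f"alpha, reflective items: {alpha_reflective:.2f}")
print(f"alpha, formative items:  {alpha_formative:.2f}")
print(f"correlation of full vs. reduced composite: {overlap:.2f}")
```

Under these assumptions the formative set earns a low alpha even though nothing is wrong with it, and deleting an indicator to “fix” that number yields a composite that no longer matches the original construct.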

The big questions regarding this distinction, however, are “so what?” and “what can we do about it?” I will address those questions in the next part of this series later this summer.


Bagozzi, Richard P. (1980), Causal Models in Marketing, New York: Wiley.
Fornell, Claes and Fred L. Bookstein (1982), “Two Structural Equation Models: LISREL and PLS Applied to Consumer Exit-Voice Theory,” Journal of Marketing Research, 19 (November), 440-452.

Jarvis, Cheryl Burke, Scott B. MacKenzie and Philip M. Podsakoff (2003), “A Critical Review of Construct Indicators and Measurement Model Misspecification in Marketing and Consumer Research,” Journal of Consumer Research, 30 (September), 199-218.