Scale-related Pet-Peeves
Blog #9 (by Cheryl Jarvis, jarvisc@fau.edu)
Confusing Reflective & Formative Scales - Part 3
The first and most important step is to spend the time and effort to clearly consider and articulate exactly what we are planning to measure BEFORE constructing a questionnaire or survey. Developing a measurement model is a theoretical exercise requiring every bit as much rigor as developing hypotheses. We must establish very clear conceptual definitions of our constructs and understand how we are going to measure and specify them before collecting any data. It is critical that the conceptualization and operationalization of a construct match – if it is conceptualized formatively, model it formatively; if it is conceptualized reflectively, model it reflectively.
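To make the contrast concrete, the two specifications can be written out in equations (a minimal sketch in standard SEM notation; the symbols are mine, not taken from our articles):

Reflective (the construct causes the items):

x_i = \lambda_i \eta + \varepsilon_i, \qquad i = 1, \dots, n

Formative (the items form the construct):

\eta = \gamma_1 x_1 + \gamma_2 x_2 + \cdots + \gamma_n x_n + \zeta

In the reflective case the causal arrows run from the construct \eta to each item x_i, so the items should correlate highly with one another; in the formative case the arrows run from the items to the construct, so the items need not correlate at all.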
We do not argue that any specific construct should be measured either formatively or reflectively – it depends completely on the conceptualization the author is using. For example, if “job satisfaction” is conceptualized as an overall positive attitude toward a job, then it should be measured reflectively with items such as “I like my job,” “I’m happy in my work,” and “I am unlikely to want to leave this position.” However, if “job satisfaction” is conceptualized as a composite of attitudes toward various components of the job – such as satisfaction with pay, hours, supervisor, coworkers, benefits, responsibilities, etc. – then it should be measured formatively with items such as “I am satisfied with my pay,” “I have a good boss,” and “My work hours are ideal.”
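To show how this conceptual choice translates into a model specification, here is a sketch in lavaan-style model syntax (the style accepted, for example, by the Python package semopy); the variable names are hypothetical stand-ins for the example items above:

# Reflective conceptualization: "=~" points from the construct to its items.
reflective_desc = """
job_sat =~ like_job + happy_work + unlikely_leave
"""

# Formative conceptualization: the construct is a function of its components.
# Note: on its own this block is NOT identified; the construct also needs
# outgoing paths, as discussed below.
formative_desc = """
job_sat ~ sat_pay + good_boss + ideal_hours
"""

The direction of the operator mirrors the direction of causality: in the reflective string the construct points at the items; in the formative string the items point at the construct.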
The basic principle is that the operationalization must conceptually match the construct definition – a researcher shouldn’t conceptualize a construct as one thing and then measure something completely different. As my colleagues and I demonstrated in our review of the literature, mismatches between theory and measurement are all too frequent in the marketing research literature. As a result of such misspecification, construct validity is at risk and managers may draw the wrong conclusions from the results – rejecting true hypotheses, accepting false ones, or misjudging the magnitude or even the causal direction of the relationships between variables. For example, if a manager is testing the impact of job satisfaction on employee retention, misspecifying the measurement of job satisfaction could mislead the manager into thinking the effect is stronger or weaker than it really is, or even in the opposite direction, resulting in inappropriate and potentially disastrous decisions.
If a researcher is unsure whether a construct should be measured formatively or reflectively, my colleagues and I have provided a set of “decision rules,” along with guiding questions, to help make the determination. These decision rules are explained in detail in our articles, listed below. I should note that researchers may have difficulty answering some of the questions, or some of the answers may be contradictory, but that is usually because the construct has not been adequately defined. Further refinement of the conceptualization may be needed – that is, better clarification of the construct’s domain, evaluation of whether all the indicators fall within that domain, and consideration of the measures’ relationships to other constructs. I’ve seen examples where a researcher had inadvertently mixed both formative and reflective indicators within the same set of scale items for a single construct, which to me indicates a lack of clear conceptualization of exactly what the researcher intended to measure.
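For readers who want a quick self-check, here is a toy sketch of my own (a paraphrase of the spirit of the guiding questions – direction of causality, interchangeability of the indicators, covariation among the indicators, and a shared nomological net – not the published decision rules themselves, which are in the articles):

def suggest_specification(construct_causes_items: bool,
                          items_interchangeable: bool,
                          items_expected_to_covary: bool,
                          items_share_nomological_net: bool) -> str:
    """Toy paraphrase of the reflective-vs-formative guiding questions.

    Each argument is the answer to one guiding question, phrased so
    that True points toward a reflective specification.
    """
    answers = [construct_causes_items,
               items_interchangeable,
               items_expected_to_covary,
               items_share_nomological_net]
    if all(answers):
        return "reflective"
    if not any(answers):
        return "formative"
    # Mixed answers usually signal an inadequately defined construct.
    return "revisit the construct definition - the answers conflict"

Note that mixed answers are not a tie to be broken by majority vote; as noted above, they usually mean the construct’s conceptual definition needs more work.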
Note that the decision as to whether a construct should be measured reflectively or formatively should never be an “empirical question.” Sadly, I’ve often reviewed manuscripts in which authors “left it up to the fit statistics” to decide whether a formative or reflective model was more appropriate. Such an approach is simply incorrect. As has been demonstrated, misspecifying a formative measurement model as reflective produces fit statistics that are biased and untrustworthy – so how could we rely on them to make the decision for us? The choice is a theoretical one, not an empirical one. Only clear thought and proper construct definitions can make the determination. Don’t fall back on faulty statistics to do the hard work for you.
Making specification decisions prior to designing the questionnaire is critical, because waiting until after the data are collected to think about the measurement model can be fatal for the project. A formatively-indicated construct requires special item construction in order to be mathematically identified within the model. If a researcher has failed to collect the types of measures needed to achieve identification, there is no way to save the analysis after the fact – the data will have to be discarded and new data collected. More information on achieving identification can also be found in my articles, listed below.
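To illustrate the kind of “special item construction” involved: one common route to identification is to give the formative construct at least two outgoing paths – for example, by also collecting two overall reflective items, yielding a MIMIC-type specification. A minimal sketch in lavaan-style syntax, with hypothetical variable names (see the articles for the full identification requirements):

# MIMIC-type specification: the formative cause items feed the construct,
# and two reflective outcome items identify its scale and disturbance.
# The reflective items must be planned BEFORE data collection.
mimic_desc = """
job_sat =~ overall_sat_1 + overall_sat_2
job_sat ~ sat_pay + good_boss + ideal_hours
"""

If the two overall items had never been written into the questionnaire, no amount of post-hoc modeling could identify the construct from the three cause items alone.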
Some researchers mistakenly assume that one solution to the challenge of modeling formative constructs is simply to use summed or averaged “scale scores.” However, this is not an acceptable solution. Scale scores are problematic for both reflective and formative constructs when used in structural equation modeling, because they ignore measurement error – and accounting for measurement error is the defining characteristic of SEM. Using scale scores thus yields inconsistent and biased estimates for hypothesis tests. There is no need to trade the “formative problem” for the problems created by scale scores in SEM. The solution is simply to model the formative indicators correctly, which is not difficult with many modern software packages.
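A classic result illustrates the bias (notation is mine). If a scale score x measures the true construct \xi with reliability \rho_{xx}, and a score y measures \eta with reliability \rho_{yy}, then the correlation computed from the scores is attenuated relative to the true construct-level correlation:

r_{xy} = \rho_{\xi\eta}\,\sqrt{\rho_{xx}\,\rho_{yy}}

Because reliabilities are below 1, tests based on scale scores are biased toward zero, whereas a latent-variable model estimates \rho_{\xi\eta} directly by modeling the measurement error.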
I’ve come across many researchers who mistakenly believe that the only way to analyze a model with formative indicators is with PLS (partial least squares) software. This is certainly not the case. Any maximum-likelihood SEM software can handle formative indicators. Modeling formative indicators is no more difficult in SEM packages than in PLS, and both approaches require the same attention to theory and to identification. AMOS and EQS are two SEM packages particularly well suited to formative specifications. By using SEM with maximum likelihood rather than resorting to PLS, the researcher retains the explicit modeling of measurement error that is both the defining characteristic of SEM and the source of its advantage over the path-analysis approach.
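Purely for illustration (AMOS and EQS are point-and-click tools), here is a sketch of how the MIMIC specification above might be estimated with maximum likelihood in the open-source Python package semopy – assuming, per my reading of its documentation, that it accepts lavaan-style MIMIC specifications; verify against your version:

import pandas as pd
from semopy import Model

# Hypothetical dataset whose columns match the sketch above.
df = pd.read_csv("job_sat_items.csv")

desc = """
job_sat =~ overall_sat_1 + overall_sat_2
job_sat ~ sat_pay + good_boss + ideal_hours
"""

model = Model(desc)
model.fit(df)            # default objective is a maximum-likelihood loss
print(model.inspect())   # parameter estimates, standard errors, p-values

The point is not the particular package but that the formative specification is estimated with full-information maximum likelihood machinery, retaining the error structure that scale scores and path analysis throw away.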