Scale-related Pet-Peeves
Blog #22
Ignoring Dimensionality
If you are using a scale to measure one thing, then it should measure just that one thing. But how confident are you that it is measuring just that one thing rather than a mash-up of several things? This is what is meant by dimensionality. It drives me crazy when I come across scales that, on the surface, appear to lack unidimensionality. While scholars are probably more sensitive to dimensionality than industry researchers, neglect of the issue exists even in the best academic journals.
If the authors of published studies do not provide evidence of unidimensionality, readers can only guess whether the items in a summated scale are measuring the same thing. In cases where I seriously doubt that the items measure the same thing, I note it in my reviews. It may tick off authors when I question the unidimensionality of their scales, but it is their responsibility not only to test it but also to report the results in their publications. If they do not provide evidence in support of unidimensionality, then readers are justified in questioning the scale’s quality.
Some may say, “What’s the big deal? I am an experienced researcher, and if I say some items in a scale measure the same thing, isn’t that good enough?” For scholars, the answer is clearly “NO, that is not good enough.” For practitioners, I’ll be more generous and say the answer depends upon what you want to do with your results. For example, if you are merely curious about what customers think of your brand and you do not plan to analyze the relationship between that attitude and something else, such as purchase intention, then I suppose dimensionality may not be a high priority. But what if you wanted to measure the portion of your target market that has positive attitudes about a new product your company has introduced and the portion that plans to buy it soon? Attitudes and intentions are two different constructs, and that means two different measures are necessary in order to adequately test the relationship/difference between the two.
For those who do not know how to assess dimensionality, I’ll give a few of the typical suggestions. The easiest thing you can do is closely inspect the items and ask whether more than one construct is being tapped into by the set of items. While that might be easy to do, it isn’t considered a very rigorous test for two reasons. First, it is not what you think that matters as much as how the people in the target market interpret the items. Second, opinions expressed by a few people from the target market in something like a focus group setting are not adequate “testing.”

To properly determine how many constructs are being measured by a set of items, and how much they measure one construct versus another, you need to use factor analysis. With exploratory factor analysis (EFA), you can see whether the items that are supposed to measure the same construct have “high” loadings on the same factor. If they do not, then you have reason to believe the set of items you thought measured one construct is actually measuring more than one. A stronger test is to simultaneously factor analyze the sets of items that you think measure several constructs. I have found that when you factor analyze only the items you think measure one construct, there is no other factor for stray items to load on, and you may get misleading results. For example, if you had five items that appeared to measure brand attitude and one that measured something related to it, such as purchase intention, it is possible in an EFA for the intention item to load with the other ones. However, if you ran an EFA on five attitude items and five intention items at the same time, the results are very likely to show that there are two factors. That is why, when I see a study in which many scales are developed from many items and the authors have not run an EFA with all of the items simultaneously, I am very suspicious.
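To make that EFA logic concrete, here is a minimal sketch in Python (not from the original post). It simulates survey responses to five hypothetical “attitude” items and five hypothetical “intention” items driven by two correlated latent constructs, then runs a varimax-rotated factor analysis on all ten items at once. The sample size, loadings, and item counts are illustrative assumptions, not recommendations.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 1000  # hypothetical number of respondents

# Two correlated latent constructs: brand attitude and purchase intention
latent = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=n)
attitude, intention = latent[:, 0], latent[:, 1]

# Five items per construct: item = 0.8 * latent score + noise
items = np.column_stack(
    [0.8 * attitude + 0.4 * rng.standard_normal(n) for _ in range(5)]
    + [0.8 * intention + 0.4 * rng.standard_normal(n) for _ in range(5)]
)

# Factor analyze all ten items simultaneously with a varimax rotation
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(items)
loadings = fa.components_.T  # shape: (10 items, 2 factors)

# For each item, find the factor it loads on most strongly
dominant = np.argmax(np.abs(loadings), axis=1)
print(dominant)  # attitude items share one factor, intention items the other
```

Run on only the five attitude items plus the single intention item, a one-factor analysis would give the intention item no second factor to load on, which is exactly the misleading situation described above.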
Of course, there are even more powerful ways to test dimensionality, e.g., confirmatory factor analysis (CFA). If you are not familiar with it, suffice it to say here that it requires the researcher to hypothesize a priori the number of factors, whether or not the factors are correlated, and which items will load on which factors. As with EFA, researchers should test the items for several scales simultaneously.
One measure that should not be used to test for unidimensionality is a measure of reliability. Plenty of articles have been written about this problem, but it seems that many scale users still have not gotten the message. Here it is: even if a scale seems to be reliable because of a “good” Cronbach’s alpha, that does not provide sufficient evidence that the items are measuring the same thing. It merely provides evidence that the items are correlated and are measuring related things. Having said that, I admit that the opposite can be informative. In other words, when I see a scale with a low alpha, say below 0.70, I strongly suspect that the items are not unidimensional. That suspicion is fueled even further if no evidence is provided in support of the scale’s unidimensionality.
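That point about alpha can be demonstrated directly. The hypothetical sketch below implements the standard Cronbach’s alpha formula and applies it to simulated data in which six items deliberately measure two distinct (but correlated) constructs. Alpha still comes out well above the conventional 0.70 cutoff, illustrating that a “good” alpha does not establish unidimensionality. The constructs, loadings, and sample size are all illustrative assumptions.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_respondents, k_items) response matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
n = 2000  # hypothetical number of respondents

# Two distinct but correlated constructs, three items each
latent = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)
items = np.column_stack(
    [0.8 * latent[:, 0] + 0.4 * rng.standard_normal(n) for _ in range(3)]
    + [0.8 * latent[:, 1] + 0.4 * rng.standard_normal(n) for _ in range(3)]
)

alpha = cronbach_alpha(items)
print(round(alpha, 2))  # comfortably above 0.70 despite two constructs mixed in
```

High alpha here reflects nothing more than the fact that all six items are positively correlated; it says nothing about whether one construct or two lie underneath.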
The bottom line is that, as much as I advocate the use of multi-item measurement of psychological constructs, that practice calls for researchers to ensure that those multiple items are truly measuring the same thing. Having one person eyeball a set of items and conclude that they measure the same thing is unacceptable to those who care about precise measurement of the things upon which important decisions will be made.