Assumption of Perpetual Validity
In my routine work of reviewing scales, I see statements made by authors of scholarly journal articles that raise red flags. I have written about several of the problems in past posts. The focus of this post has to do with authors borrowing a scale from past research and implying that they do not need to show evidence of its psychometric quality in their study because it has already been "validated." It is as if to say, the scale is beyond reproach and has reached a holy status. I don’t buy that! Here are some of my reasons.
- The scale is long. As discussed in previous posts, my observation has been that the more items there are in a scale, the more likely it is to lack unidimensionality. If a scale lacks unidimensionality, it isn’t clear what a score means because it represents a combination of factors. I am amazed at how many older, long scales have been found in one or more studies to be multidimensional, yet they are revered enough that researchers state (or imply) that no examination of psychometric quality is called for. Admittedly, multiple factors in a scale do not automatically mean it lacks validity, but they do mean that work is needed to show that the dimensions are first-order factors that load on a common second-order factor.
- Only limited testing has been conducted. There are various forms of validity, and I won’t go into all of them here. Suffice it to say that if a study has merely provided evidence of a scale’s unidimensionality and/or reliability, that is the beginning of the process, but it does not amount to validating a scale. Even if there is some evidence of one form of validity, subsequent users of the scale should be cautious about claiming that it is validated and/or that there is no need for further testing.
- The scale was examined with one segment of the population. Even in the best academic research, it is typical for some rigorous tests of validity to be performed only with a convenience sample such as college students. Doing rigorous testing with one group is fine as a start, but later users of the scale should be cautious about viewing the scale as having been validated. Relatedly, even if multiple tests have been conducted with samples that are representative of one culture’s population, it is improper for researchers using the scale in another culture, particularly when another language is involved, to assume there is no need to conduct validity tests again. That leads to the next point.
- The scale has been adapted. If we allow that a scale has, indeed, received considerable support for its validity in multiple past studies, another problem arises when later users of the scale change it in some way. Items may be rephrased, response anchors may be changed, items may be added or dropped, or the scale may be translated into another language. Even though there may be no reason to believe the scale’s psychometric quality has diminished, tests should be conducted anyway, especially when the measure is to be used in theory testing. It should not be assumed that it is still valid if it has been changed!
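To make the idea of re-testing concrete, here is a minimal sketch of one basic check a later user could run on their own sample: coefficient alpha (internal consistency). It assumes responses are stored as a list of respondents, each a list of item scores; the function name `cronbach_alpha` and the data are hypothetical and not taken from any particular study. Keep in mind that alpha speaks only to reliability. As noted above, that is the beginning of the process and does not amount to validating a scale.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores.

    Hypothetical helper for illustration; assumes complete data (no missing
    responses) and the same number of items for every respondent.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")

    # Sample variance of each item (column) across respondents.
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(col) for col in item_columns)

    # Sample variance of each respondent's summed scale score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")

    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)


# Made-up responses from five people to a three-item scale.
sample = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 2, 3],
    [1, 2, 1],
]
print(round(cronbach_alpha(sample), 3))
```

A check like this takes minutes to run on the adapted or translated version of a scale, which is part of why skipping it is hard to justify when the measure is used in theory testing.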
Having said all of that, let me state clearly that I do not expect every scale to be thoroughly examined for its quality and for that to be reported in every publication. A good topic for discussion is the circumstances that justify validity testing, particularly if the study is to be published in a top scholarly journal. What I am saying here is that claiming a scale is valid based on dated or limited testing conducted elsewhere (external to the article) is unacceptable and should be rejected on its face. Validation is a process, and it is best to view all scales as being in near-constant need of validation (e.g., Cronbach 1971; Peter 1981). That is why I try to be careful when making statements in my reviews about validity. With the best of scales, when authors have used multiple studies to test many aspects of psychometric quality, I usually say something like “the authors have provided considerable support for several types of validity.” What I will not say, or at least try to avoid saying, is that a scale “has been validated.”
The bottom line is that researchers should not use the phrase “the scale has been validated” as a way to get out of doing some validity testing themselves. No scale is holy and perpetually beyond reproach!
Cronbach, Lee J. (1971), “Test Validation,” in Educational Measurement, R. L. Thorndike, ed. Washington, D.C.: American Council on Education, 443-507.