Don't Use Reverse-coded Items in Scales
There used to be a rule of thumb in scale development that it was appropriate, and even preferable, to have reverse-coded items in most summated measures. This was taught because of a concern that, in scales with many items, respondents might lose motivation as they read items that seem very similar and, consequently, zip through them checking off the same level of response. To minimize the problems of inattention and acquiescence, the suggestion was to include some items stated in the opposite direction.
Despite the potential benefit(s), however, researchers have noted for years the problems that were caused by that practice. One of the problems is that reverse-coded items frequently produce unexpected factor structures (e.g., Netemeyer, Bearden, and Sharma 2003). My own experience shows that items stated in the opposite direction from the others tend to load on another factor, an undesirable characteristic of scales that are supposed to be unidimensional. Another problem when making a scale composed of items with opposing meanings is miscomprehension (Swain, Weathers, and Niedrich 2008). It is easy for respondents to misinterpret phrases that include negation, e.g., the store does not provide low-quality service. These problems are compounded when scales are translated for use in other languages (Wong, Rindfleisch, and Burroughs 2003).
The good news is that there are other ways to handle response patterns without resorting to reverse-coded items. For example, short scales are less likely to fatigue respondents and lead them to mindlessly give the same response to every item. This touches on a related issue I will deal with separately: the “right” number of items needed to measure a unidimensional construct. Suffice it to say for now that 3-8 items are likely to be sufficient (Bagozzi and Baumgartner 1994; Green and Rao 1970).
Another way to reduce response styles is to present a “random” mix of items from several scales. Most studies examine more than one construct, so their questionnaires contain items from more than one scale. If possible, mix those items together. The effect should be that respondents do not easily see a pattern, are less complacent, and are therefore more motivated to answer honestly. Admittedly, this works best with items that use the same response format. For example, if you have three scales that all use a five-point Likert-type response scale, their items could be mixed together. It is more difficult to mix items that have different formats, and doing so could even confuse participants. For example, imagine that your survey has a semantic differential scale with several items (e.g., the product is attractive / the product is ugly), another set of items that measure the frequency of a behavior (e.g., I send fewer than 10 text messages a week / I send over 100 text messages a week), and several Likert-type items regarding some attitude (agree/disagree). In that case, you would only mix together those items with the same response formats.
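The mixing approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scale names, item wordings, and format labels are hypothetical, and the function name `mix_items_by_format` is invented for this example. It pools items that share a response format and shuffles each pool, so items from different scales are interleaved but formats are never mixed.

```python
import random
from collections import defaultdict


def mix_items_by_format(scales, seed=None):
    """Pool items by response format and shuffle within each pool.

    scales: dict mapping scale name -> (response_format, list of item texts)
    Returns a dict mapping response_format -> shuffled list of
    (scale_name, item_text) pairs, so each item can be scored later.
    """
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    pools = defaultdict(list)
    for scale_name, (response_format, items) in scales.items():
        for item in items:
            pools[response_format].append((scale_name, item))
    for pool in pools.values():
        rng.shuffle(pool)  # items from different scales get interleaved
    return dict(pools)


# Hypothetical scales and items, for illustration only.
scales = {
    "store_satisfaction": ("likert_5", [
        "I am satisfied with this store.",
        "This store meets my expectations.",
        "I would shop at this store again.",
    ]),
    "brand_trust": ("likert_5", [
        "I trust this brand.",
        "This brand keeps its promises.",
        "This brand is honest with its customers.",
    ]),
    "product_appearance": ("semantic_differential", [
        "ugly / attractive",
        "plain / stylish",
    ]),
}

blocks = mix_items_by_format(scales, seed=42)
for response_format, items in blocks.items():
    print(response_format)
    for scale_name, text in items:
        print(f"  [{scale_name}] {text}")
```

Keeping the scale name attached to each item means the shuffled questionnaire can still be scored scale by scale afterward. In a real study you might prefer to randomize the order per respondent, which this sketch does not attempt.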
The bottom line is that the tide has turned against using reverse-coded items in multi-item scales that are expected to be unidimensional. They have always had problems, and the benefits they may have had can be achieved in other ways.
References:
Bagozzi, Richard P. and Hans Baumgartner (1994), "The Evaluation of Structural Equation Models and Hypothesis Testing," in Principles of Marketing Research, R.P. Bagozzi, ed., Cambridge, MA: Blackwell Publishers, 386-422.
Green, Paul E. and Vithala R. Rao (1970), "Rating Scales and Information Recovery: How Many Scales and Response Categories to Use?" Journal of Marketing, 34 (July), 33-39.
Netemeyer, Richard G., William O. Bearden, and Subhash Sharma (2003), Scaling Procedures: Issues and Applications, Newbury Park, CA: Sage Publications, Inc.
Swain, Scott D., Danny Weathers, and Ronald W. Niedrich (2008), "Assessing Three Sources of Misresponse to Reversed Likert Items," Journal of Marketing Research, 45 (1), 116-131.
Wong, Nancy, Aric Rindfleisch, and James E. Burroughs (2003), "Do Reverse-Worded Items Confound Measures in Cross-Cultural Consumer Research? The Case of the Material Values Scale," Journal of Consumer Research, 30 (June), 72-91.