Editor’s note: Rajan Sambandam is vice president of research at TRC, a Fort Washington, Pa., research firm.
A considerable amount of customer satisfaction data is collected using tracking studies. While continuity is particularly important in tracking studies, situations sometimes arise where major changes are necessary. One such change is moving from one scale (say, five-point) to another scale (say, 10-point). There could be many reasons for that kind of change, but it raises the obvious question: How can data collected using the two scales be compared? There are two possible ways of approaching this problem: scale equivalence and re-scaling. In both approaches the objective is to help the researcher compare data that are measured in different ways and make informed decisions. In this article we take a deeper look at these two approaches to scale conversions. The underlying assumption here is that the scale wording is sufficiently comparable that scale conversions can be attempted.
• Scale equivalence
In this approach no attempt is made to modify the data in any way. Instead the focus is on identifying the appropriate way of reporting that would enable scores to become comparable. This wouldn’t be applicable in all situations and is primarily useful in situations where “boxed” scores (top-two box, top-three box, etc.) are reported.
Consider four scales (in terms of scale points) that are used commonly in marketing research: five-point, seven-point, 10-point and 11-point scales. Often the results of a study using these kinds of scales are reported using boxed scores. Questions then relate to how a study using a five-point scale and reporting on “top-two box” scores can be translated when the new scale has, say, seven points. In the approach of scale equivalence we look at the proportion of a scale each scale point covers.
For example, each scale point on a five-point scale covers 20 percent of the scale. That is, if we were generating completely random data to respond to this scale, we would expect approximately 20 percent of the responses to be 1, 20 percent to be 2 and so on. Therefore, a top-two box score would cover 40 percent of the scale points on a five-point scale. Similarly, for a seven-point scale, each scale point accounts for approximately 14 percent of the scale and top-two box scores would account for about 29 percent of the scale points. Table 1 shows the box score distributions for the four scales.
The boxed numbers show that, for example, a top-two box score on a five-point scale accounts for approximately the same proportion of the scale as a top-three box score on a seven-point scale, or top-four box score on a 10-point scale (approximately 40 percent). Hence, when data using these scales are to be compared, the relevant number of top boxes could be used. More generally, Table 2 provides (approximate) conversions for boxed scores among the four scales. A “?” indicates that a simple conversion is not available.
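The coverage arithmetic behind this comparison can be sketched in a few lines of Python. This is only an illustration: the function names `box_coverage` and `closest_box` are hypothetical, and the "nearest coverage" rule is an assumption rather than the rule used to build Table 2, which marks some conversions as unavailable.

```python
from fractions import Fraction

def box_coverage(points: int, top_n: int) -> Fraction:
    """Share of a `points`-point scale covered by its top `top_n` boxes."""
    if not 1 <= top_n <= points:
        raise ValueError("top_n must be between 1 and the number of scale points")
    return Fraction(top_n, points)

def closest_box(source_points: int, source_top_n: int, target_points: int) -> int:
    """Top-box count on the target scale whose coverage is nearest the source's.

    Assumption: "nearest coverage" is a reasonable stand-in for equivalence.
    It always returns an answer, even when the match is poor.
    """
    target = box_coverage(source_points, source_top_n)
    return min(
        range(1, target_points + 1),
        key=lambda n: abs(box_coverage(target_points, n) - target),
    )

# Coverage of the top-one through top-four boxes on the four common scales.
for pts in (5, 7, 10, 11):
    row = ", ".join(
        f"top-{n}: {float(box_coverage(pts, n)):.0%}" for n in range(1, 5)
    )
    print(f"{pts}-point  {row}")
```

With these helpers, `closest_box(5, 2, 7)` picks the top-three box and `closest_box(5, 2, 10)` picks the top-four box, matching the example in the text. A researcher would still want to check how far apart the two coverages are before treating them as equivalent.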
Scale equivalence lets us compare results from different scales without altering the data in any way. While this may be sufficient in some cases it is clearly a limited solution, since only boxed scores can be dealt with in this manner.
• Re-scaling
The basic idea in re-scaling is to alter the scale in such a way that the two scales in question can be directly compared. Such an alteration should enable not just boxed-score reporting, but also mean score reporting. It should be noted that re-scaling relates only to modifying the scales for aggregate reporting purposes and not changing data at the individual respondent level.
Re-scaling is best demonstrated with real data. Toward this end, a split sample experiment was run using a consumer Web panel to study five-point and 10-point scale conversions. In this experiment, respondents were asked to rate how satisfied they were with their primary bank. A random half of the respondents answered the question using a five-point scale, and the remaining half answered on a 10-point scale. Such a split sample design is useful for one main reason: once each scale is converted, it can be compared with the actual scale from the other half to investigate the effects of the conversion.
A total of 223 respondents used a 1-10 scale anchored by Very Dissatisfied and Very Satisfied, while a total of 197 respondents used a 1-5 scale again anchored by Very Dissatisfied and Very Satisfied. Converting the 10-point scale to a new five-point scale is straightforward, since two scale points at a time can be compressed into one. So for example, ratings of 10 and 9 can be converted to 5, ratings of 8 and 7 can be converted to 4 and so on. When converting from a 10-point scale to a new five-point scale, this seems to be the simplest and most reasonable way of doing it.
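As a minimal sketch of this pairwise compression, the snippet below folds a 10-point distribution into five points at the aggregate level. The function name `collapse_10_to_5` and the counts are hypothetical; they are not the survey data from this experiment.

```python
def collapse_10_to_5(counts10: dict[int, int]) -> dict[int, int]:
    """Fold a 10-point distribution into five points: 1-2 -> 1, ..., 9-10 -> 5."""
    counts5 = {point: 0 for point in range(1, 6)}
    for rating, count in counts10.items():
        if not 1 <= rating <= 10:
            raise ValueError(f"rating {rating} is outside the 1-10 scale")
        # (rating + 1) // 2 maps 1,2 -> 1; 3,4 -> 2; ...; 9,10 -> 5
        counts5[(rating + 1) // 2] += count
    return counts5

# Hypothetical counts for 100 respondents (illustration only).
hypothetical_10 = {1: 2, 2: 3, 3: 4, 4: 6, 5: 8, 6: 10, 7: 15, 8: 20, 9: 12, 10: 20}
collapsed = collapse_10_to_5(hypothetical_10)
print(collapsed)
```

Working on the distribution rather than on individual records keeps the conversion a reporting step, and the total number of respondents is preserved.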
Converting the five-point scale to a new 10-point scale is somewhat more complicated because we are going from a situation with less information to one with more information. One relatively straightforward way to do the conversion is to simply multiply every scale point by two. In this case the resulting new 10-point scale will have only the five (even-numbered) scale points. Both conversions are shown in Table 3 for the data on hand.
Looking at the distribution of the data, a few points can be made:
— When converting to a new five-point scale and then comparing to the original five-point scale in the other half of the sample, it appears that essentially the same distribution is retained.
— When converting to a new 10-point scale, the distribution is choppy since only five scale points have values associated with them.
— However, if we were only concerned about boxed scores, this would not be so bad since those are similar to scores from the original 10-point scale.
Going beyond the distribution of the data, the mean and standard deviation of each scale were also calculated, as shown in Table 4.
Clearly, the mean and standard deviation scores for the five-point scales are very similar, indicating that the conversion from 10-point to five-point is successful. But converting from the five-point to the new “10-point” scale appears to overestimate the mean (8.24 compared to 7.80) and underestimate the standard deviation (1.93 compared to 2.16). The mean is overestimated because, for every two scale points, we are using only the higher of the two, i.e., between 10 and 9, only 10 is being used. Similarly, since only half the scale points are being used, the standard deviation is underestimated.
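The effect of the multiply-by-two conversion on summary statistics can be illustrated with a short sketch. Doubling every rating doubles both the mean and the standard deviation of the five-point data and leaves only the even scale points populated. The counts below are hypothetical, and the population form of the standard deviation is an assumption, since the article does not say which form was used.

```python
import math

def mean_sd(counts: dict[int, int]) -> tuple[float, float]:
    """Mean and population standard deviation of a rating distribution."""
    n = sum(counts.values())
    mean = sum(rating * c for rating, c in counts.items()) / n
    variance = sum(c * (rating - mean) ** 2 for rating, c in counts.items()) / n
    return mean, math.sqrt(variance)

def double_scale(counts5: dict[int, int]) -> dict[int, int]:
    """Simple five-to-10 conversion: every scale point is multiplied by two."""
    return {2 * rating: c for rating, c in counts5.items()}

# Hypothetical five-point counts (illustration only, not the survey data).
hypothetical_5 = {1: 3, 2: 5, 3: 15, 4: 34, 5: 43}
doubled = double_scale(hypothetical_5)

mean5, sd5 = mean_sd(hypothetical_5)
mean10, sd10 = mean_sd(doubled)
print(f"five-point: mean={mean5:.2f}, sd={sd5:.2f}")
print(f"doubled:    mean={mean10:.2f}, sd={sd10:.2f}")
print("populated points:", sorted(doubled))
```

Because every respondent in a pair such as 9 and 10 lands on the upper point, the doubled mean sits above what a true 10-point scale would likely produce, which is the pattern described above.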
• Alternative re-scaling method
Rather than simply multiplying each scale point by two to convert a five-point scale to a new 10-point scale, we could take a more complicated approach. In this method, data from each scale point would be allocated to two scale points in proportion. For example, the 43 percent of the respondents who gave a 5 on the original five-point scale will be distributed between the 10 and 9 scale points of the new 10-point scale.
On what basis would the proportional distribution be made? Randomly assigning half the respondents to 10 and half to 9 would make sense if no other information were available. But other information is available in the form of the original 10-point scale in the other half of the sample. Does it make sense to use this information?
It does, if we can assume that people who gave a particular rating on a five-point scale would stay around that part of the scale if they had been presented with a 10-point scale. That is, if someone gave a 5 on a five-point scale, the assumption is that that person most likely would have given a 10 or 9 on a 10-point scale. Similarly, a 4 would be either an 8 or 7 on the 10-point scale. Of course, it is possible that a person who gave a 4 on a five-point scale could give a 9 on a 10-point scale, but the results wouldn’t change dramatically because of that.
Table 5 shows the original tables, with the new 10-point scale calculated using the proportional re-distribution method. Thus for example, the 43 percent who gave a 5 on the original five-point scale have now been split such that 30 percent have a 10 rating and 13 percent have a 9 rating on the new 10-point scale.
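A minimal sketch of proportional re-distribution follows. The function name `redistribute_5_to_10` is hypothetical, and both distributions are made-up percentages chosen so that the top pair reproduces the 43 percent to 30/13 split described above; they are not the values from Table 5. When the template has no responses in a pair, the sketch falls back to an even split, mirroring the no-other-information case.

```python
def redistribute_5_to_10(
    dist5: dict[int, float], template10: dict[int, float]
) -> dict[int, float]:
    """Split each five-point share across its two 10-point scale points.

    Point p on the five-point scale maps to points 2p-1 and 2p, and its share
    is divided in the same ratio those two points have in the template.
    """
    new10: dict[int, float] = {}
    for point in range(1, 6):
        low, high = 2 * point - 1, 2 * point
        pair_total = template10.get(low, 0) + template10.get(high, 0)
        # Assumption: with no template information, split the share evenly.
        high_ratio = template10.get(high, 0) / pair_total if pair_total else 0.5
        share = dist5.get(point, 0)
        new10[high] = share * high_ratio
        new10[low] = share - new10[high]  # remainder keeps the total intact
    return new10

# Hypothetical percentages (illustration only).
dist5 = {1: 3, 2: 5, 3: 15, 4: 34, 5: 43}
template10 = {1: 2, 2: 3, 3: 4, 4: 6, 5: 8, 6: 10, 7: 14, 8: 10, 9: 13, 10: 30}

new10 = redistribute_5_to_10(dist5, template10)
for rating in range(10, 0, -1):
    print(f"{rating:>2}: {new10[rating]:.1f}")
```

The quality of the result depends entirely on how well the template reflects the population that answered on the five-point scale.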
Both in terms of the distribution and the summary statistics (shown in Table 6), the proportional re-distribution system does a much better job of mimicking the original 10-point scale. Of course, we have aided the process by using the original 10-point scale distribution as the template for redistribution. But if the conversion is used for a tracking study, it would not be a problem since the distributional pattern of the data usually tends to be stable over time.
How does one get a scale to use as a template? Consider the case where a tracking study conducted on a five-point scale for many quarters is converted to a 10-point scale starting this quarter. In order to make effective comparisons, data from previous waves would need to be converted to a 10-point scale. For that purpose, the distribution of the 10-point scaled data from the current quarter can be used to proportionally redistribute the previous waves’ data. It is not an ideal solution because one has to assume that the same distribution from the current 10-point scale would have appeared in the previous waves if such data had been collected. But there don’t seem to be ideal solutions when it comes to scale conversions.
This experiment considered only five-point and 10-point scales. The main conclusion, as expected, is that reducing the number of scale points is easier than increasing it. If there is a need to increase scale points, then the presence of a template provides a much better solution since it allows the use of proportional re-distribution.
Converting to and from scales that do not differ by integer multipliers (say, five-point to seven-point or seven-point to 10-point) is a more difficult task. When going from a larger scale to a smaller scale it is a bit easier, but even then decisions will have to be made regarding folding of multiple scale points into single scale points on the new scale. For example, when going from a 10-point scale to a seven-point scale, the end points and midpoints may need to be rolled into single scale points. When going from a seven-point scale to a 10-point scale a template would be needed to achieve the proper distributions. Of course, when making such a seven-to-10 conversion one always has the option of just multiplying every scale point by 1.43. But this would be equivalent to multiplying every scale point by two in a five-to-10 conversion and hence the disadvantages mentioned there would apply.
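To see why the simple multiplier carries the same drawback, the short sketch below applies the 10/7 factor (approximately 1.43) to each seven-point scale value. The variable names are hypothetical and the snippet only illustrates the arithmetic.

```python
# Multiply each seven-point scale value by 10/7 (approximately 1.43).
rescaled = {point: point * 10 / 7 for point in range(1, 8)}

for point, value in rescaled.items():
    print(f"{point} -> {value:.2f}")

# Only seven distinct values exist on the new "10-point" scale.
distinct_values = len(set(rescaled.values()))
print("distinct values:", distinct_values)
```

As with doubling in the five-to-10 case, the converted data occupy only as many positions as the original scale had, and most of them fall between the integer points of a true 10-point scale.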
Difficult task
Scale conversion in tracking studies is not something that should be undertaken unless absolutely necessary. At times, however, it needs to be done and we as researchers are left with the difficult task of determining a practical course of action. Some conversions are relatively easier than others, but there are really no perfect conversions. It is our hope that this article provides some guidance on how best to achieve conversions while maintaining reasonable trending over time.