Do harder questions produce more useful answers?
Editor’s note: Randy Brooks is president of Directions Research Inc., Cincinnati.
The use of rating scales in one form or another consumes a considerable proportion of the average marketing research professional's work. What I am about to describe will challenge your thinking about the utility of much of that effort and lay out a promising alternative for you to consider.
My hope for this article is that, after reading it, every reader will rethink the methods they use for measuring consumer attitudes - methods most of us have employed from our first days in the business.
Bewildering array
For the past few years, the Internet data-quality controversy has galvanized the survey research industry. The topic rose to prominence following a 2006 conference on respondent cooperation hosted in Chicago by the IIR and Bob Lederer of RFL Communications, during which several research executives raised questions about accepted industry practices. Over the next three to four years, a bewildering array of articles, speeches and conferences examined online data collection, holding it to very high standards of accuracy, predictability and quality.
Among the many areas that were repeatedly explored was a relative newcomer to the world of survey research: straightliners, aka inattentive respondents. These individuals give undifferentiated responses to ratings questions - all or many brands are “excellent.” The data are neither useful nor interesting, and there have been suggestions that these respondents should be eliminated from the sample because they have, presumably, not paid close attention to the tasks they were asked to perform. Most speakers/authors have suggested that the Internet method created this problem, since no interviewers were present to make sure that the respondents “paid attention.”
Early on, the industry agreed that (another new term) speeders - those who finished questionnaires with implausible speed - were clearly gaming the system, and they were deleted from most final samples. The straightliners, then, were consumers who had spent a sensible amount of time answering questions but whose answers were not very discriminating and not very interesting.
At our firm, we decided to look into the straightliners more carefully. Our first step was to take a broad look at the issue. We combed our files for studies we had done over the prior 20 years that: obtained ratings on multiple brands; utilized a variety of data collection methods; and included a broad array of categories and respondents.
We then created a consistent definition of straightlining and tested each data set for the presence of this issue. As Figure 1 illustrates, Internet studies were no more likely to show evidence of straightlining than other methods had been in the past. Clearly, straightlining has always been a factor in our work - it is not restricted to Internet studies.
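For readers who want to apply a similar screen to their own data, here is a minimal Python (pandas) sketch. The data are invented and the rule shown - flagging a respondent whose ratings never vary across any brand or attribute - is the strictest possible version, offered as an illustration rather than as our exact definition.

```python
import pandas as pd

# Invented example data: one row per respondent/brand pair, with
# attribute ratings on a 1 (poor) to 5 (excellent) scale.
ratings = pd.DataFrame({
    "resp_id": [1, 1, 1, 2, 2, 2],
    "brand":   ["A", "B", "C", "A", "B", "C"],
    "taste":   [5, 5, 5, 4, 2, 5],
    "value":   [5, 5, 5, 3, 4, 2],
    "service": [5, 5, 5, 5, 3, 4],
})
ATTRS = ["taste", "value", "service"]

def is_straightliner(attr_rows: pd.DataFrame) -> bool:
    """True if a respondent gave one identical rating to every brand
    on every attribute - the strictest possible rule."""
    return attr_rows.stack().nunique() == 1

flags = ratings.groupby("resp_id")[ATTRS].apply(is_straightliner)
print(flags)  # respondent 1 -> True, respondent 2 -> False

# A companion "speeder" screen works the same way at the interview
# level, e.g. flagging completes faster than half the median length.
```

A practical screen would likely relax the rule (say, flagging identical ratings on some high share of items rather than all of them), but the mechanics are the same.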
In the majority of studies respondents are asked to rate brands they have used on attributes relevant to the category. The task is simple: List the brands of ice cream (for example) you use and now rate these brands on “taste.” Think about that for a moment. Presumably all of the brands of ice cream you use taste pretty good. In fact, most of them might get an excellent rating. That could be a perfectly rational and logical (if analytically boring) answer. Perhaps the straightlining problem is not the fault of inattentive consumers but rather a logical and rational answer to a trivial question.
In looking further at work we have done, we examined an R&D project completed in the fall of 2008. Over 70,000 respondents were interviewed using a very carefully drawn national probability sample provided by Greenfield Online. Respondents reported their restaurant usage daily over a seven-week period, and the resulting data correlated closely with sales of the leading 100 chains - highly accurate data. Beyond that, consumers rated their satisfaction with all chains they used “yesterday” on four key characteristics of restaurant performance. Figure 2 summarizes the ratings for McDonald’s versus the average for all quick-service restaurants (QSRs). Remember that McDonald’s has a very dominant position in the industry: with a brand share of over 30 percent, and with no other restaurant enjoying more than 10 percent, McDonald’s is three times bigger than its closest competitor.
Figure 2 shows the consumer average rating of satisfaction with visits to a quick-service restaurant “yesterday.” McDonald’s, the clear leader in the industry by a factor of three, was last or nearly last on all four attributes. This made no sense. If measured attitudes bear no relationship to sales, then there has to be something wrong with the way the attitudes are measured!
Wonder about an alternative
As we thought about this problem we began to wonder about an alternative to ratings of brands among users. Part of the problem with ratings is that they permit ties. The alternative we focused on was ranking of brands among users. We hypothesized that this would be a harder question to answer but were hopeful that it would prove to be more useful (i.e., predictive of behavior).
We elected to do the work among users of QSRs again. We obtained a national probability sample from Greenfield Online - this time 2,325 respondents who had visited a QSR in the past month. Respondents were shown the logos of the leading 35 QSRs in the U.S. and asked which they had visited in the past month and how often. The panel was then divided into two matched groups. The first group rated all the brands they used on 12 attributes using an excellent-to-poor scale. The second group ranked the brands they used on each attribute.
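How the matching is done matters less than that the two cells be comparable on usage. A hypothetical Python sketch of one common approach - sorting respondents on past-month visit frequency and randomly splitting adjacent pairs - appears below; the matching variable and pairing rule are assumptions for illustration, not a description of our actual procedure.

```python
import random

random.seed(7)

# Hypothetical respondent list with a matching variable:
# past-month QSR visits (the choice of variable is an assumption).
respondents = [{"id": i, "visits": random.randint(1, 20)} for i in range(20)]

# Sort on the matching variable, then split each adjacent pair at
# random so the two cells stay balanced on visit frequency.
respondents.sort(key=lambda r: r["visits"])
ratings_cell, rankings_cell = [], []
for twins in zip(respondents[0::2], respondents[1::2]):
    pair = list(twins)
    random.shuffle(pair)
    ratings_cell.append(pair[0])
    rankings_cell.append(pair[1])

avg = lambda cell: sum(r["visits"] for r in cell) / len(cell)
print(avg(ratings_cell), avg(rankings_cell))  # cell means should be close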
We decided to test the alternative approaches’ ability to predict or correlate with behavior. A simple correlation between the alternative measures of attitude and share of past-four-week usage seemed appropriate. We looked carefully at alternative ways to use the data and decided on three measures:
- Ratings data: percent who rate the brand excellent; average score on a five-point scale.
- Rankings data: percent who ranked the brand No. 1.
Next we calculated the correlation of each measure with share (see table). Clearly, the rankings correlate more strongly with behavior than the ratings do.
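The computation behind the table is straightforward. The Python (pandas) sketch below shows it with invented brand-level inputs - the numbers are not the study's actual figures, merely chosen to echo the pattern we found.

```python
import pandas as pd

# Invented brand-level summaries (not the study's actual figures).
# share: past-four-week usage share; the three attitude measures
# are computed per brand from the ratings and rankings cells.
brands = pd.DataFrame({
    "brand":            ["McD", "BK", "Wendys", "Subway", "TacoBell"],
    "share":            [0.31, 0.09, 0.08, 0.07, 0.06],
    "pct_excellent":    [0.38, 0.35, 0.36, 0.40, 0.33],
    "mean_rating":      [3.9,  3.8,  3.9,  4.0,  3.7],
    "pct_ranked_first": [0.42, 0.11, 0.10, 0.12, 0.08],
})

for measure in ["pct_excellent", "mean_rating", "pct_ranked_first"]:
    r = brands["share"].corr(brands[measure])  # Pearson r
    print(f"{measure:>16}: r = {r:+.2f}")
```

With inputs like these, the percent-ranked-No.-1 measure tracks share almost perfectly while the two ratings measures barely move with it - the same qualitative result the table reports.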
Figure 3 shows the image advantage this approach yields for McDonald’s. The image obtained from this metric was far more sensible: McDonald’s dominates other chains on key attributes like convenience, value, affordable, speed and family. These data help to explain the reasons for behavior rather than contradicting it.
Like them a lot
If we take a long view of this issue, it seems clear that consumers could and should “like” the brands they use - in fact, they should like them a lot. In the QSR space consumers have myriad options but the average respondent chooses to use just six different ones. Presumably these chosen restaurants have passed an important test. Having made it to the inner circle of brands, however, does not guarantee equal success. Consumers make discrete choices every day (they only rate brands when annoying market researchers trap them into answering lengthy questionnaires!).
Consumers “break ties” every time they choose one brand or restaurant over another. These decisions may be difficult to make but some brand has to win on each occasion. The rankings approach taps into that process. It is clear that attractive contenders lose sales every day to brands that are slightly better. These may be tough choices but they are made daily.
We checked the interview length for the two approaches and found that the ratings panel whipped through the questionnaire in just seven minutes. They had a pretty simple task: provide obvious answers to trivial questions. The rankings panel took 11 minutes to complete their tasks. Obviously we obtained more attention from the respondents when we asked a more demanding question!
The ranking task requires each consumer to explicitly and repeatedly examine the brands they use and then select the best one. The frame of reference for each consumer is unique to them. It is an interesting metric and one that appears to provide superior actionable insight into the connection between attitudes and behavior.
We suspect that the relative stability common in much attribute tracking work may be due to the probable insensitivity of ratings data, and that tracking of consumer attitudes might be improved by using the more sensitive ranking approach.
It seems to us that rankings offer a very attractive alternative to ratings, especially when consumers in the categories of interest regularly use multiple brands. It furthermore seems to us that straightlining is largely the creation of our profession - the result of asking questions that have an obvious answer.