Watch out for dropouts
Editor’s note: Allen Hogg and Jeff Miller are, respectively, director of marketing and senior vice president of Burke Interactive, the online research and reporting support organization at Cincinnati-based Burke, Inc.
It is common knowledge in the research industry that the percentage of respondents dropping out of Web surveys rises as questionnaires lengthen. Less well known is how such mid-survey terminations affect study findings.
Recent experiments conducted by Burke, Inc. and SPSS MR show that questionnaire dropouts can indeed impact results — particularly on concept tests, the most common use of Web surveys.
Respondents who indicated they were relatively uninterested in an interactive television concept described in a recent survey were far more likely to drop out of the long version of a questionnaire about that concept than were similarly uninterested respondents taking the shorter version of the survey.
The effect of this higher dropout rate is to inflate concept interest scores. Scores among those completing the long survey version were significantly higher than they would have been if those who dropped out of the survey were also included in the findings.
In a parallel experiment with a customer loyalty survey, the pattern was less pronounced, but there was at least directional evidence that those less satisfied with their experiences at selected restaurant chains were more willing than other respondents to continue to the end of the long version of the survey. The findings again demonstrated how results can vary depending upon whether those who dropped out are included.
Experiments
The experiments took place in March 2003 using respondents recruited to AOL’s OpinionPlace Web site. OpinionPlace respondents, who are distributed to research studies by SPSS MR, are asked to indicate how much time they have available for surveys and offered variable incentives - AOL credit or American Airlines frequent flier miles - based upon the length of the study to which they are assigned.
For both the concept test and customer loyalty survey experiments, respondents were assigned to one of three cells. The first cell for each experiment consisted of respondents who indicated having the minimum amount of time - 15 minutes - to complete a survey and were assigned to take a shorter version of either the concept test or customer loyalty questionnaire. These shorter questionnaires took people, on average, six to eight minutes to complete. Those who completed the survey received a $2.25 credit or 150 American Airlines frequent flier miles.
The second cell for each experiment consisted of respondents who also indicated having 15 minutes to complete a survey. These respondents, however, were assigned to complete longer questionnaires designed to take up all of those allotted 15 minutes. (In fact, these surveys actually turned out to take respondents, on average, almost 19 minutes to complete.) Respondents in this cell who completed their surveys also received a $2.25 credit or 150 miles.
A third cell for each experiment also consisted of respondents assigned to take the longer survey versions. These people, however, had indicated that they would have up to 30 minutes to complete the survey - and were also given a more substantial incentive, a $4.50 credit or 300 miles, for finishing the task.
Between 193 and 214 completed surveys were obtained for each cell of each experiment. For the concept test, in addition to indicating interest in the interactive television service described in an initial statement, respondents assigned to take the longer survey version were asked to rate their interest in more than 50 specific possible features of the service and propose prices for different levels of service. For the customer loyalty study, respondents assigned to the longer survey version were asked to rate the importance of 45 specific restaurant attributes, as well as how well two different chains performed on those attributes, plus some additional questions about dining habits not asked of those taking the shorter survey version.
Concept test results
As expected, the longer version of the concept test did have a higher mid-survey termination rate (23 percent) than the shorter version (13 percent) did. The termination rate was particularly high (29 percent) for those who were assigned to take the longer survey version, but indicated they had only 15 minutes to complete the study and received the lower incentive. Among those who indicated they had 30 minutes and received the larger incentive, the dropout rate was 16 percent.
Table 1 shows that, for the longer version of the survey, the dropout rate was not consistent across concept interest levels. Those less interested in the concept were much more likely to terminate the questionnaire. Three out of 10 people taking the longer survey version who indicated they were not very or not at all interested in the interactive television concept dropped out, compared with just 18 percent of those who were somewhat interested and only 13 percent of respondents who were extremely or very interested.
By contrast, dropout levels for the shorter survey version were fairly consistent across interest levels, suggesting that, for concept tests of less than eight minutes in length, it is something other than a lack of interest in the survey topic that prompts people to drop out.
The effect the varying dropout rates can have on survey findings becomes clear when one examines results among those who were assigned to the longer survey version, but only promised the smaller incentive for completing it. Among those in this cell who did complete the survey, 22 percent indicated that they were extremely or very interested in the concept, and just 13 percent indicated that they were not at all interested. Among those in this cell who did not complete the survey, only 13 percent indicated being extremely or very interested in the concept, and 28 percent indicated being not at all interested.
The common top-two-box measure of interest in a concept is therefore inflated because the long survey prompts less interested respondents to drop out. On the flip side, bottom-box rejection of the concept is underestimated if only those completing the survey are counted.
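To gauge the size of the distortion, one can blend the completer and dropout figures reported above, weighting by the cell's 29 percent dropout rate. The Python sketch below is purely illustrative arithmetic on those published percentages; it assumes the dropouts' pre-termination interest answers are valid data.

```python
# Illustrative reweighting using the article's figures for the cell
# assigned the long survey with the smaller incentive (29% dropout).
dropout_rate = 0.29
completion_rate = 1 - dropout_rate

# Interest distributions reported for completers vs. dropouts
# (dropouts answered the interest question before terminating).
top_two_completers, top_two_dropouts = 0.22, 0.13  # extremely/very interested
bottom_completers, bottom_dropouts = 0.13, 0.28    # not at all interested

# Weighting the two groups by their share of survey starters approximates
# the scores that would be reported if everyone who started were counted.
top_two_all = completion_rate * top_two_completers + dropout_rate * top_two_dropouts
bottom_all = completion_rate * bottom_completers + dropout_rate * bottom_dropouts

print(f"Top-two-box interest: {top_two_completers:.0%} (completers only) "
      f"vs. {top_two_all:.1%} (all starters)")
print(f"Not at all interested: {bottom_completers:.0%} (completers only) "
      f"vs. {bottom_all:.1%} (all starters)")
```

Under these assumptions, counting all starters trims the top-two-box score from 22 percent to roughly 19 percent and raises bottom-box rejection from 13 percent to over 17 percent; on a go/no-go concept decision, margins of that size can matter.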
Although dropout rates can be reduced if the survey population is restricted to those willing to take longer surveys in return for greater incentives, it should be noted that changing the population in this way will also affect results. Among respondents who were willing to take a 30-minute survey and completed the questionnaire, 39 percent indicated being extremely or very interested in the interactive television concept, while just 8 percent indicated being not at all interested. It appears that people willing to spend more time on a survey might be more disposed to give more positive responses.
Customer loyalty study results
For the customer loyalty study, the contrast in termination rates was even greater than it was for the concept test. Only 6 percent of respondents assigned to the shorter version of the survey did not complete it, while 28 percent of those assigned to the longer survey version dropped out. Among those who indicated having only 15 minutes to spend on a survey, 37 percent dropped out, while the dropout rate was 17 percent among those willing to spend a half hour.
Examining whether dropout rates varied by satisfaction level produced only limited evidence. Table 2 suggests that, for the longer survey, people less satisfied with their restaurant experiences were somewhat more inclined to complete the survey, although the differences were not statistically significant.
It should be noted that, to qualify for the customer loyalty study, respondents had to indicate that they had eaten at two or more of 10 of the most common casual dining chains in the United States (Applebee’s, Bennigan’s, Chili’s, Hooters, Lone Star Steakhouse & Saloon, Red Lobster, Ruby Tuesday, Olive Garden, Outback Steakhouse, and T.G.I. Friday’s). The restaurants they were asked to rate were selected at random from among all those they said they had eaten at, with a preference given to three restaurants to ensure there would be enough ratings of particular restaurant chains to make comparisons while holding the chain rated constant.
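For readers curious about the mechanics, here is a hypothetical Python sketch of that selection rule. The article says each respondent rated two chains; the placeholder names for the three preferred chains and the preference weight they receive are our own assumptions, as the actual weighting was not published.

```python
import random

FOCAL_CHAINS = {"Chain A", "Chain B", "Chain C"}  # the three preferred chains
FOCAL_WEIGHT = 3  # assumed preference factor; the actual weighting is not published

def pick_chains_to_rate(eaten_at, n=2):
    """Draw n chains to rate from those the respondent has eaten at,
    oversampling the focal chains so they accumulate enough ratings."""
    pool = list(eaten_at)
    weights = [FOCAL_WEIGHT if chain in FOCAL_CHAINS else 1 for chain in pool]
    picked = []
    for _ in range(min(n, len(pool))):
        choice = random.choices(pool, weights=weights, k=1)[0]
        idx = pool.index(choice)  # remove the pick: sample without replacement
        pool.pop(idx)
        weights.pop(idx)
        picked.append(choice)
    return picked

print(pick_chains_to_rate(["Chain A", "Hooters", "Red Lobster", "Chain C"]))
```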
So that we are not seen as spreading good or bad news about any particular restaurant chain, we will call the restaurants given preference in selection “Apple-Chili Garden,” “Lone Redback,” and “Ben & Ruby Friday’s.” Looking again at the satisfaction question, Table 3 demonstrates how responses can differ depending upon whether those not completing the survey are included in the findings. (Findings here are limited to those who agreed to spend just 15 minutes on a survey and were assigned to the longer survey version.)
Although the pattern was not consistent across the three restaurant chains, the data show that the decision whether to include non-completers could affect perceptions of the restaurants’ relative performance. Results based only on the opinions of those who completed the surveys make the percent “very satisfied” with Apple-Chili Garden and Lone Redback appear quite similar, with both doing substantially better than Ben & Ruby Friday’s on this measure. Adding in findings from survey non-completers, however, widens the gap between Apple-Chili Garden and Lone Redback substantially and moves Lone Redback’s scores closer to those of Ben & Ruby Friday’s.
Just as in the concept test, respondents willing to spend more time on the study again tended to provide more positive ratings. Across all restaurants rated, 54 percent of satisfaction ratings given by those who completed the longer customer loyalty survey and indicated being willing to spend a half hour on it were “very satisfied,” while only 47 percent of satisfaction ratings given by those who completed the longer survey but only indicated having 15 minutes fell in this category.
Implications
Obtaining different findings because respondents dropped out mid-survey can be seen as an extension of the familiar problem of non-response error, which occurs when those who do not respond to survey invitations differ in important ways from those who do. (This can be a particular problem when survey invitations reveal too much about a study’s subject.) Similarly, those who do not complete surveys can differ in important ways from those who finish.
Although some evidence of this effect was seen in the customer loyalty experiment, it may be especially likely in concept testing. Because people who drop out of these surveys tend to be less interested in the concept being tested than those who finish, reporting findings based only on the opinions of completers tends to produce biased, inflated estimates of concept interest. As Web studies lengthen, they become more prone to this bias.
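A small simulation makes the mechanism concrete. The dropout probabilities below are loosely patterned on the long-survey figures in Table 1, but the underlying interest mix is invented for illustration; neither represents the actual study data.

```python
import random

random.seed(7)

# Hypothetical dropout probabilities by interest level, loosely patterned
# on the long-survey figures reported above.
DROPOUT_PROB = {"high": 0.13, "medium": 0.18, "low": 0.30}
# Assumed true interest mix among everyone who starts the survey.
TRUE_MIX = [("high", 0.20), ("medium", 0.40), ("low", 0.40)]

def simulate_completers(n=100_000):
    """Return the interest levels of the respondents who finish the survey."""
    completers = []
    for _ in range(n):
        # Draw an interest level from the assumed population mix.
        r, cum = random.random(), 0.0
        for level, share in TRUE_MIX:
            cum += share
            if r < cum:
                break
        # Less interested respondents are more likely to drop out.
        if random.random() > DROPOUT_PROB[level]:
            completers.append(level)
    return completers

completers = simulate_completers()
print(f"True share highly interested: {TRUE_MIX[0][1]:.1%}")
print(f"Share among completers only:  {completers.count('high') / len(completers):.1%}")
```

Because the least interested respondents disappear at the highest rate, the completers-only sample over-represents enthusiasm, and the gap grows as interest-related dropout rates diverge.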
The outcomes of these experiments therefore reinforce the notion that mid-survey termination rates should be minimized by keeping Web questionnaires short. It is, of course, a bad idea to pad any survey with unnecessary questions, no matter what the data collection method. The temptation to add questions on the Web can be great, however, because the incremental cost of doing so often seems trivial. In fact, the cost can be huge: the added length can render the entire set of study findings questionable.
If studies cannot be kept short, researchers might get more reliable estimates of the opinions of a population by including in findings survey responses offered by those who terminated the questionnaire, as well as those who completed it. Such an action would, of course, go against years of marketing research tradition and could make compiling and reporting study results a more difficult and confusing process.
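Operationally, this means retaining every answer a respondent gave before terminating and tabulating each question over everyone who answered it. Below is a minimal sketch of such a tabulation, assuming each hypothetical respondent record stores the answers captured before termination plus a completion flag.

```python
from collections import Counter

# Hypothetical records: answers captured up to the point of termination,
# plus a flag for whether the respondent finished the survey.
records = [
    {"interest": "extremely", "completed": True},
    {"interest": "somewhat", "completed": True},
    {"interest": "not at all", "completed": False},  # dropped out later on
    {"interest": "not very", "completed": False},
    {"interest": "very", "completed": True},
]

def tabulate(records, question, completers_only):
    """Frequency distribution for one question, with or without dropouts."""
    pool = [r for r in records
            if question in r and (r["completed"] or not completers_only)]
    counts = Counter(r[question] for r in pool)
    return {answer: n / len(pool) for answer, n in counts.items()}

print("Completers only: ", tabulate(records, "interest", completers_only=True))
print("All who answered:", tabulate(records, "interest", completers_only=False))
```

One consequence of this approach is that the base size varies from question to question, which is part of why, as noted above, compiling and reporting results this way becomes more difficult.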
Researchers could also work to circumvent the issue by adopting standard surveys for concept testing purposes. Because concept test results are often compared to survey scores received by products previously introduced to the marketplace, the relative appeal of a particular concept is often more indicative of its likely marketplace success than the absolute numbers resulting from the survey. If standard concept tests are used, bias caused by survey termination would likely still exist, but it might be fairly consistent across sets of studies of concepts of similar appeal.
One thing survey researchers should not do is ignore this issue. As Web surveys lengthen, dropout rates will increase. If results are compiled based only on respondents who complete the questionnaires, findings likely will be affected. Researchers should not let their desire to ask more questions lead them down a path to biased data.