The survey savior?
Editor’s note: Deborah Sleep is director at London-based Engage Research. Jon Puleston is vice president of research firm Global Market Insite Inc.’s interactive group in London.
In an article in last month’s Quirk’s (“The survey killer”), we went to the roots of respondent boredom and highlighted the consequences of online survey respondents losing interest. This follow-up article reveals how technology that provides new question and response mechanisms can help researchers overcome this recurring challenge while making online surveys more interesting and easier for respondents to take. It presents the findings of the next phase of our research project, which compares the results gathered from regular online surveys with those from surveys using alternative question formats and interactive Flash elements.
Having identified some of the problematic effects of respondent boredom, we explored a variety of ways to overcome the following issues:
• the general decay in engagement levels when completing surveys as respondents become bored;
• long grid and check-box questions causing dropouts and answer decay;
• repetitive questions causing dropouts;
• the sensitivity of open-ended responses to respondent engagement levels;
• respondents skipping past instructions and not reading them properly.
Accepted practice
The use of visuals and animation techniques is an accepted practice in most other forms of passive communication as a means of engaging consumers. Think of TV without it. Or imagine delivering a PowerPoint presentation to 50 people without adding in a few visuals or animating some of the bullet points. This raises the question: Why are these techniques so rarely used in surveys, where you may be communicating with upwards of 10 times this number of people?
Our belief was that adding animated elements and visuals into surveys could:
• trigger greater respondent interest in answering the questions;
• help communicate questions more effectively; and
• encourage respondents to spend more time thinking about the questions they were being asked.
To test this, we conducted a series of experiments in which we integrated visuals and animation into a variety of different question scenarios and measured their effect on the overall respondent experience.
A pool of 20 typical long survey questions was gathered, along with a range of questions typically found in a new product evaluation survey conducted within the fast-moving consumer goods industry. A 20-minute test survey was designed, which desk research suggested was at the limit of the average respondent’s tolerance for completion. A series of 15 test surveys was conducted, made up of different sets of pool questions in different positions, using different question formats to explore various order and creative effects, with a minimum sample of 200 responses per sub-cell variant. In total, 3,300 respondents were surveyed as part of this research exercise, and 12 different question types and nine different question formats were tested.
The first test we carried out was the Shine experiment, where we explored using animation to stimulate interest in a new product development task. A concept for a new imaginary drink called Shine was created, together with a series of typical concept test questions. Over 1,400 respondents were surveyed, with different cells of respondents being exposed to the concept in different ways and at different points within a wider survey to measure the impact of boredom factors. The project tested and compared a plain-text introduction vs. a static mood board introduction vs. a Flash-animated introduction, and asking the same questions at the start vs. in the middle of a survey.
We then evaluated the completion times for the questions, the character of the data collected and the volume of open-ends provided - key measures of respondent engagement. We discovered that the use of an animated introduction led to respondents spending nearly 80 percent more time answering the follow-on questions and generating 50 percent more open-ended comments, producing more considered and complete responses.
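As a rough illustration of how such engagement measures can be derived from raw response data, the sketch below compares mean answer time and mean open-end word count between two cells. It is a minimal, hypothetical example: the field names and the sample figures are ours for illustration, not the data or scripts used in the actual study.

```typescript
// Minimal sketch: comparing engagement measures (time spent and open-end
// word count) between two cells. Field names and data are illustrative only.
interface SurveyResponse {
  cell: "plainText" | "animatedIntro";
  answerSeconds: number; // time spent on the follow-on questions
  openEnd: string;       // verbatim open-ended comment
}

const wordCount = (text: string): number =>
  text.trim() === "" ? 0 : text.trim().split(/\s+/).length;

const mean = (values: number[]): number =>
  values.reduce((sum, v) => sum + v, 0) / values.length;

function summarize(responses: SurveyResponse[], cell: SurveyResponse["cell"]) {
  const subset = responses.filter((r) => r.cell === cell);
  return {
    meanSeconds: mean(subset.map((r) => r.answerSeconds)),
    meanWords: mean(subset.map((r) => wordCount(r.openEnd))),
  };
}

// Tiny made-up data set and the relative uplift of one cell over the other.
const data: SurveyResponse[] = [
  { cell: "plainText", answerSeconds: 45, openEnd: "Nice idea" },
  { cell: "animatedIntro", answerSeconds: 80, openEnd: "I really like the fresh, modern look" },
];
const plain = summarize(data, "plainText");
const animated = summarize(data, "animatedIntro");
console.log(`Time uplift: ${((animated.meanSeconds / plain.meanSeconds - 1) * 100).toFixed(0)}%`);
console.log(`Word uplift: ${((animated.meanWords / plain.meanWords - 1) * 100).toFixed(0)}%`);
```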
Very effective technique
The Shine experiment showed the animated introduction to be a very effective technique for stimulating open-ended comments. This technique was then tested with three other open-ended question formats. Simple tests were conducted with split cells of 100 respondents, one half being asked the question with an animated introduction and the other half without.
Spontaneous ad-recall question: A standard spontaneous ad-recall question, “What ads do you recall seeing in the last two weeks?” was placed halfway through a survey. Prior to this, one cell of respondents saw a short animated sequence with a visual of a TV, poster, newspaper and Web page. This was set so that respondents could not press the [Next] button until the 15-second animation had finished. This very simple technique increased the average respondent word count by nearly 60 percent.
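The original mechanism was built in Flash; as a rough modern-browser equivalent, the sketch below shows only the timing gate, with the [Next] button kept disabled until the stimulus has run for 15 seconds. The element id is an assumption made for illustration.

```typescript
// Hypothetical browser equivalent of the timing gate described above:
// the [Next] button stays disabled until the 15-second stimulus has played.
const STIMULUS_MS = 15_000;

function gateNextButton(nextButton: HTMLButtonElement): void {
  nextButton.disabled = true; // respondent cannot skip past the animation
  window.setTimeout(() => {
    nextButton.disabled = false; // unlock once the stimulus has finished
  }, STIMULUS_MS);
}

// Usage (the "#next-button" id is illustrative):
const next = document.querySelector<HTMLButtonElement>("#next-button");
if (next) gateNextButton(next);
```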
Respondent imagination question: More common to qualitative research, but indicative of the sort of question researchers might need respondents to really think about, a projective question was used, asking respondents what type of animal a product might be. Faced with this question halfway through a standard survey, nearly half the respondents skipped past it, probably because it required a lateral style of thinking that they felt was too much trouble. The same question was then asked preceded by a visual animation showing a variety of animals that the product might be. An [I don’t know] button was included as a way of quantifying the level of interest in answering such a question. The inclusion of the animation increased response levels by 25 percent.
Thinking question: Finally, another thinking-style question was tested, this time asking respondents to list the foods they don’t like to eat. To stimulate a response, an animation sequence showing a woman expressing disgust at eating something was added. This helped increase the word count by 90 percent.
Impact of wording
We demonstrated the value of stimulating respondents when capturing open-ended data, but other techniques might be used to the same effect.
The research went on to examine the impact of wording on stimulating responses. For the ad-recall question, we tried a different wording approach. The question “Please list all the ads you recall ...” was modified by adding the phrase “We would like to set you a test: please list all the ads you recall ...” Tested on two cells of 100 respondents, this small change in wording generated a staggering 88 percent increase in the number of ads recalled, indicating that this could be an effective technique.
In another variant, respondents were told they had three minutes to recall as many ads as they could, with a counting-down clock on screen; this increased the volume of responses by a factor of three (see Figure 1). While this is clearly a powerful technique for stimulating respondents, there are some issues to consider. The animation can influence the evaluation process, and the nature of the stimulus affected the nature of the responses. In the Shine experiment, for example, the way in which the concept was introduced had an impact on the way consumers rated it: the animations made respondents feel more positively about the product. So using these techniques requires some care and calibration, especially for judgment-style tasks. More research is needed to understand the nature of this halo effect and how to account for it.
Attempted to redesign
The next area we explored attempted to redesign some of the traditional question formats that respondents were getting bored with, to see if there were more effective solutions.
The common grid (matrix) question, which dominates so many surveys, has been a standard question format in online research since its inception. In a poll we conducted, answering long question grids was cited as one of the most frustrating things about completing online surveys. Of the 550 surveys we analyzed, 80 percent contained at least one grid question, with an average of three grid questions per survey.
Grids are widely used as an easy way to ask a bank of similar questions, and the nature of the data collected can be very valuable. The rationale for their being grouped in a grid, rather than being asked one by one as a set of radio or check-box questions, is that the respondent does not have to continually click the [Next] button to advance. However, interactive technology allows grid questions to be asked in many different ways.
We designed some new grid-format questions from scratch to see if these could help improve the overall user experience and the quality of the data. From a respondent’s point of view, a matrix question is a very difficult format to read. Up to six eye movements are required to answer a question, as shown in Figure 2.
1. The question is read
2. The eye scans across to the option to be selected
3. The eye scans up to check it is in the right column
4. The eye scans back down to select the option
5. The eye sweeps back to check it is on the correct row
6. The eye moves to the answer to select it
It is easy for respondents to make mistakes, and a mistake often results in an annoying error message. On top of this, the sheer number of options presented to respondents all at once can have a psychological impact on how the question is answered.
Simply highlighting the rows that have already been answered eliminates two of the eye movements and makes it harder for respondents to miss a row. Introducing this measure more than halved the frequency with which respondents failed to answer the question properly and were shown an error message (Figure 3).
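To make the idea concrete, here is a minimal sketch in present-day browser terms rather than the Flash used in our tests. The markup (one radio group per table row) and the “answered” class name are assumptions made for illustration.

```typescript
// Sketch of grid-row highlighting: each <tr> holds one radio group, and a
// row is visually marked as soon as an option in it is selected.
function enableRowHighlighting(grid: HTMLTableElement): void {
  grid.addEventListener("change", (event) => {
    const input = event.target as HTMLInputElement;
    if (input.type !== "radio") return;
    input.closest("tr")?.classList.add("answered"); // e.g. a shaded background
  });
}

// Count unanswered rows before deciding whether to show an error message.
function unansweredRows(grid: HTMLTableElement): number {
  return Array.from(grid.querySelectorAll("tbody tr")).filter(
    (row) => !row.querySelector("input[type=radio]:checked")
  ).length;
}
```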
Even with highlighting, four eye movements are still required, so some interactive variants of these long question formats were tested further, starting with scrolling matrices. In the horizontal scrolling matrix, the options scroll in one by one, and scroll out as soon as the question has been answered (Figure 4). We also tested a vertical scrolling matrix, where the answers scrolled up the page (Figure 5).
Both of these new approaches require just two eye movements: one to read the question and the next to select the answer. [Back] buttons and a progress counter were added to allow respondents to review their answers and understand how many more options there were to answer.
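The logic behind these one-statement-at-a-time formats is simple; the sketch below shows it in outline (rendering is left out, and the structure and names are ours for illustration, not the Flash implementation used in the tests).

```typescript
// Sketch of a one-statement-at-a-time scrolling matrix: the respondent sees a
// single statement, answers it, and the next one scrolls in; a counter shows
// progress and a [Back] action allows answers to be reviewed.
interface ScrollingMatrix {
  statements: string[];
  answers: (number | null)[]; // one rating per statement
  index: number;              // statement currently on screen
}

function createMatrix(statements: string[]): ScrollingMatrix {
  return { statements, answers: statements.map(() => null), index: 0 };
}

function answerCurrent(m: ScrollingMatrix, rating: number): void {
  m.answers[m.index] = rating;
  if (m.index < m.statements.length - 1) m.index += 1; // auto-advance
}

function goBack(m: ScrollingMatrix): void {
  if (m.index > 0) m.index -= 1; // [Back] button: review a previous answer
}

function progressLabel(m: ScrollingMatrix): string {
  return `${m.index + 1} of ${m.statements.length}`;
}
```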
We examined two other approaches already in common use as alternatives to grid questions: sliders and drag-and-drop option selection (Figure 6). The research tested each of these different formats, with long sets of questions ranging between 30 and 45 options, against the traditional grid-question format, using split cells of at least 200 respondents answering the same question presented in different formats and placed at different points in the survey. The following factors were measured: the time spent answering these questions compared to standard grid questions; how the character of the data differed; how respondents reacted to these different questions at the start vs. the end of the survey; and the granularity of data from these different questions compared to conventional grid-format questions.
Reacted positively
In a follow-up survey, respondents were polled about these different question formats and asked to rate each one in order to ascertain which they most liked answering. Figure 7 shows a summary of our findings by format. In all cases, respondents reacted more positively to questions asked in these alternative ways than they did to grids, especially the drag-and-drop and the two scrolling matrix designs, which also consistently increased the time respondents spent answering.
The analysis looked for respondents with a tendency to give pattern answers - selecting the same option more than five times in a row - and compared this tendency across question formats. It also examined the average standard deviation of respondent-level data.
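Both checks are simple to reproduce. As an illustration (not the exact analysis scripts used in the study), the sketch below flags a respondent who selects the same option more than five times in a row and computes the standard deviation of an individual respondent’s ratings.

```typescript
// Sketch of the two data-quality checks: straight-lining (more than five
// identical answers in a row) and respondent-level standard deviation.
function isPatternAnswerer(ratings: number[], runLength = 5): boolean {
  let run = 1;
  for (let i = 1; i < ratings.length; i++) {
    run = ratings[i] === ratings[i - 1] ? run + 1 : 1;
    if (run > runLength) return true; // more than five identical in a row
  }
  return false;
}

function standardDeviation(ratings: number[]): number {
  const mean = ratings.reduce((sum, r) => sum + r, 0) / ratings.length;
  const variance =
    ratings.reduce((sum, r) => sum + (r - mean) ** 2, 0) / ratings.length;
  return Math.sqrt(variance);
}

// Share of pattern answerers in a cell, for comparison across formats.
const patternRate = (cell: number[][]): number =>
  cell.filter((ratings) => isPatternAnswerer(ratings)).length / cell.length;
```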
The drag-and-drop question format and the two scrolling matrix designs all showed significantly lower levels of pattern answering in controlled tests against standard matrix questions. Sliders, however, led to slightly higher levels of pattern answering, which raises questions over their effectiveness for long banks of questions.
The answer patterns from these new question formats appear to be very similar to those of grid questions, with the exception of the slider format, which can give a distinctly different bell curve of responses.
In Figure 8 you can see a close match between the standard grid and the custom vertical scrolling matrix question. The slight differences are in line with those seen due to engagement factors.
In this second example (Figure 9), the responses to matrix and drag-and-drop are similar. However, there are significantly higher neutral scores from the slider format question. This is likely due to the fact that, with sliders, respondents can make very small positive and negative movements that are grouped as neutral scores.
The best-performing of the two newly created scrolling matrix designs appears to be the horizontal scrolling matrix, providing consistent answers with low levels of pattern answering and higher standard deviation, backed up by the highest respondent rating - nearly 90 percent said that they liked this question format.
Finite number
The research project also examined the question of how to improve responses to check-box questions. The key issue identified here is that with long sets of check-box options, there is a finite number of selections that respondents are prepared to make.
The possible solutions involved breaking the question into two parts, translating the question into binary selection choices, and using a custom question approach with visuals to make the question more interesting to answer.
This experiment used a standard prompted ad-recall question, asking respondents to select the brands for which they recalled seeing advertising on TV, using a list of the top 30 U.K. TV advertisers.
When respondents were presented with the list on one page, with two columns of options, the average selection rate was 44 percent. However, when the question was broken out into two pages, each page presenting the respondent with 15 options, the average recall increased to 52 percent, representing an 18 percent improvement.
The next approach involved two different uses of a conditional yes/no question: a drag-and-drop format question and a custom format designed to make it easy for respondents to run through a long list of options very quickly, using brand logos to make the question more appealing. We measured the average number of responses and the time taken to complete these two question formats, then compared the results with those from the same question asked as a multichoice check-box selection, using split cells of 100 respondents.
The drag-and-drop yes/no option generated a 58 percent average recall, a 30 percent improvement. The custom-designed question format generated a 64 percent average recall, a 45 percent improvement. However, this increase in response has to be weighed against the extra time these questions took respondents to answer. The single multichoice question took respondents an average of 40 seconds to answer, while the two-page multichoice question took 48 seconds, and the custom question format 64 seconds - so 20 percent and 60 percent longer, respectively.
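For clarity, the improvement and time-cost figures quoted here are relative changes against the single-page multichoice baseline (44 percent recall, 40 seconds). The short worked example below shows the calculation; the function name is ours.

```typescript
// Worked arithmetic: relative change against the single-page multichoice
// baseline of 44 percent recall and 40 seconds completion time.
const relativeChange = (value: number, baseline: number): number =>
  (value / baseline - 1) * 100;

console.log(relativeChange(52, 44).toFixed(0) + "%"); // two-page recall lift: ~18%
console.log(relativeChange(64, 44).toFixed(0) + "%"); // custom-format recall lift: ~45%
console.log(relativeChange(48, 40).toFixed(0) + "%"); // two-page extra time: 20%
console.log(relativeChange(64, 40).toFixed(0) + "%"); // custom-format extra time: 60%
```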
We only had a full set of data comparing selection rates at the start of the survey vs. the end for the standard multichoice and the drag-and-drop question formats. The average scores for the drag-and-drop question remained static while there was a 10 percent drop in the multichoice selection, and a 17 percent drop for the last third of the brands (Figure 10).
Got bored
At the end of each survey, we asked respondents whether they had been bored answering any specific questions, prompting them specifically about the questions tested. An average of 7 percent identified the multichoice question as boring to answer, rising to 9 percent for the drag-and-drop format question and falling to 5 percent for the custom question approach.
We can conclude that, while these approaches add a few seconds to the time it takes to complete a survey, they do not appear to make the survey process significantly more boring, and they represent a good alternative for questions where it is important to secure an answer.
Respondents most commonly identify repetitive question sets as the most frustrating aspect of participating in online surveys, so this is another area we explored with a view to finding better solutions. We took a standard repetitive brand evaluation question set and re-engineered it into an animated version with paired options. While this custom approach does appear to improve the granularity of the data, 30 percent of respondents singled this question out as boring to answer in the custom format, vs. 19 percent for the standard approach. So this attempt to improve on the question format didn’t work, and further research and experimentation are required.
Combined improvements
Finally, we examined the combined improvements in survey content with regard to overall dropout rates and looked more generally at the respondent experience of taking surveys in different formats. We tested the dropout rate with matched cells of respondents, each taking the same survey but with different styles of content. We compared completion rates for a basic HTML version, a Flash shell version with only grid-format questions, a partially interactive version with a mix of question formats and a fully animated Flash version with the new custom question formats we had designed. Note also that respondents were given the same incentive to take part in these surveys, though it was slightly lower than for a normal survey of this length: this was a 16-minute survey, and respondents were given a standard 10-minute survey incentive.
Drawn as it is from only one survey, this evidence has to be described as anecdotal. However, the difference between dropout levels was measurable. In the fully interactive study, dropout rates were less than 10 percent, against 17 percent for the partially interactive, 23 percent for the shell version and 40 percent for the HTML variant (Figure 11).
We also measured the completion time for each survey variant. We found that respondents spent, on average, two minutes more completing the fully interactive version of the survey than the shell version (16.8 vs. 14.8 minutes), which, as we have seen, is a measure of increased engagement (Figure 12). Note the slightly higher completion times for the HTML version, where we need to take into account extra question-loading times (roughly three seconds per question); there is no waiting time for question loading in the Flash version.
Adding fun
So overall, what is this research telling us? Is interactivity the be-all and end-all of respondent boredom remedies? Hardly, but the research certainly demonstrates that leveraging visual stimuli is one way of adding fun to online surveys to keep respondents engaged until the end, with the goal of getting more thoughtful responses and better data while providing a great survey experience.
Most of all, it demonstrates that the mindset of our industry needs to change, that it is time for us to address this critical issue holistically, and that we need to look at our respondents in a different light. Right now, we are taking our respondents for granted. Respondent engagement today is hanging by a fragile thread: their goodwill. Incentives are all very well, but they are not the only motivator. We need to think beyond compensation and consider the survey-taking experience - an enjoyable one being an integral part of the panelist reward process. Our respondents deserve surveys that are empowering and entertaining. Time to challenge traditional approaches and put our thinking caps on!