Intelligent (survey) design
Editor’s note: Michaela Mora is president of Dallas research company Relevant Insights LLC.
The advent of user-friendly online survey tools in recent years has created the illusion that anybody can write a survey questionnaire. After all, how hard can it be? It’s like asking questions in a conversation, many think. However, there are many methodological issues to consider when creating a questionnaire if you want to gather high-quality data in a survey. The following are 10 issues that arise in survey design.
1. Data collection mode
Some questions may elicit different answers if asked in an online survey, a telephone interview, a paper survey or a face-to-face interview. Words carry more weight in phone surveys and in-person interviews because of the conversational format, while visual design elements have a bigger impact on how questions are read and interpreted in online surveys. Be aware of the types of questions that are a good fit for online surveys.
2. Respondent effort
There are questions that put a heavier burden on the respondent’s working memory and comprehension or are likely to elicit higher non-response in some data collection modes than in others. Experience tells us that asking a ranking question with 10 items over the phone can overwhelm respondents. In online surveys, rating questions in matrix format with a large number of items increase fatigue and boredom and often lead respondents to adopt a “satisficing” behavior. A common form of satisficing occurs when respondents select the same scale point to rate all items without giving them much thought. They settle for the least effortful mental path to satisfy the question’s requirements rather than work to find the answers that best represent their opinions.
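This kind of satisficing (often called straightlining) is easy to screen for once the data is in. Below is a minimal sketch in Python; the data, the column names and the everyone-gave-one-rating rule are illustrative assumptions, not a standard check from any survey platform.

import pandas as pd

# Hypothetical matrix-question data: one row per respondent,
# one column per item rated on a 1-5 scale (values invented).
responses = pd.DataFrame({
    "item_1": [4, 3, 5, 3],
    "item_2": [4, 2, 5, 4],
    "item_3": [4, 5, 5, 2],
    "item_4": [4, 1, 5, 5],
})

# Flag respondents who gave every item the identical rating,
# a deliberately strict straightlining rule.
straightliners = responses.nunique(axis=1) == 1
print(responses[straightliners])  # rows 0 and 2 in this toy data

Flagged respondents can then be reviewed, or their answers discounted, rather than taken at face value.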
3. Question wording
Formulating a question with the right wording so it accurately reflects the issue of interest is one of the hardest parts of writing questionnaires. You may have seen political polls get different answers depending on how a question is crafted. Data errors can creep into a survey if we use unfamiliar, complex or technically inaccurate words; ask more than one question at a time; use incomplete sentences; use abstract or vague concepts; make the questions too wordy; or ask questions without a clear task.
Another issue related to question wording is the risk of introducing bias by leading the respondent in a particular direction. I recently received a mail survey sponsored by the Republican Party, meant to represent the opinions of voters in my congressional district, and one of the questions was:
“Do you think the record trillion-dollar federal deficit the Democrats are creating with their out-of-control spending is going to have disastrous consequences for our nation?”
Could this question be more biased? The use of adjectives such as “record,” “out-of-control” and “disastrous” makes it really clear what the expected answer is and what the intentions of the sponsor are.
4. Question sequence
Questions should follow a logical flow. Order inconsistencies can confuse respondents and bias the results. For instance, if you are measuring brand awareness and ask respondents to recognize brands they are familiar with before asking which brands first come to mind, you render the results of the latter question worthless, since respondents can’t avoid thinking of the brands they just saw in the first question. This seems basic, but it happens.
5. Question format
Questions can be closed-ended or open-ended. Closed-ended questions provide answer choices, while open-ended questions ask respondents to answer in their own words. Each type of question serves different research objectives and has its own limitations. The key issues here are the level of detail and information richness we need; our prior knowledge of the topic; and how much we risk influencing respondents’ answers. For example, for closed-ended questions we need to decide what the answer choices should be and in which order they should appear. This requires that we know enough about the topic to provide answer options that capture the information accurately.
6. Information accuracy
Some questions yield more accurate information than others. Respondents can answer questions about their gender and age accurately, but when it comes to attitudes and opinions on a particular issue, many may not have a clear answer. Overall, attitude and opinion questions should be worded in a way that best reflects how respondents think and talk about a particular issue, so that we can tease out information that is difficult for the respondent to articulate. However, some questions need to be skipped when they don’t apply to the respondent’s experience or the issue is so irrelevant that he or she has no formed opinion about it. When attitude statements appear grouped in a matrix format and some may not apply to a respondent (e.g., a customer satisfaction survey after a phone call to customer support), it is necessary to include a “Not sure/Don’t know/Not applicable” option to avoid introducing measurement error into the data.
For example, the other day I received an online customer satisfaction survey from BlackBerry after a call I made to its support desk. The survey had a question in which I was asked to rate the representative who took my call on different aspects. One of them was “Timely Updates: Regular status updates were provided regarding your service request.” I wouldn’t know how to answer this, since the issue I called about didn’t require regular updates. Luckily, they had a “Not applicable” option; otherwise I would have been forced to lie, and one side of the scale would have been as good as the other.
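That “Not applicable” option also matters at the analysis stage: such answers should be treated as missing rather than folded into the scale. Here is a minimal Python sketch, assuming a hypothetical coding in which 99 marks “Not applicable” (the coding is invented for illustration):

import numpy as np
import pandas as pd

# Hypothetical 1-5 satisfaction ratings in which 99 encodes
# "Not sure/Don't know/Not applicable" (an assumed coding).
ratings = pd.Series([5, 4, 99, 3, 99, 5])

# Averaging the raw codes folds N/A into the scale and ruins the score.
print(ratings.mean())            # 35.83, meaningless

# Recode N/A to missing first; pandas then averages real answers only.
clean = ratings.replace(99, np.nan)
print(clean.mean())              # 4.25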
7. Measured behaviors
People tend to have less-precise memories of mundane behaviors they engage in on a regular basis, and usually they do not categorize events by periods of time (e.g., week, month or year). We need to consider appropriate reference periods for the type of behavior we want to measure. Asking “Have you purchased any piece of clothing in the last seven days?” will yield a more accurate behavior measure than asking “Have you purchased any piece of clothing in the last six months?”
Measured behavior should also be relevant to the respondent and capture his or her likely state of mind. This is particularly true when we use rating questions and have to decide whether to include a neutral mid-point. A lot of research has been conducted in this realm, particularly by psychologists concerned with scale development, but no definitive answer has been found and the debate continues. Some studies find support for excluding the mid-point, others for including it, depending on the subject, audience and type of question.
Those against a neutral point argue that by including it we give respondents an easy way to avoid taking a position on a particular issue. There is also the argument that including a neutral point wastes research dollars, since this information would be of little value or, at worst, would distort the results. This camp advocates avoiding the neutral point and forcing respondents to tell us on which side of the issue they stand.
However, consumers make decisions all day long and many times find themselves idling in neutral. A neutral point can reflect any of these scenarios: we feel ambivalent about the issue and could go either way; we don’t have an opinion about the issue due to lack of knowledge or experience; we never developed an opinion about the issue because we find it irrelevant; we don’t want to give our real opinion if it is not considered socially desirable; or we don’t remember a particular experience related to the issue that is being rated.
By forcing respondents to take a stand when they don’t have a formed opinion about something, we introduce measurement error into the data, since we fail to capture a plausible psychological state in which respondents may actually find themselves. This is yet another reason to include a “Not sure/Don’t know/Not applicable” option in addition to a neutral point.
8. Question structure
Questions have different parts that must work in harmony to capture high-quality data. These are the question stem (e.g., What is your age?), additional instructions (e.g., Select one answer) and response options, if any (e.g., Under 18, 18 to 24, 25+). The wrong combination can leave respondents baffled about how to answer a question. Consider the example below.
Overlapping answer options:
What is your household income? Select one answer.
1. Under $25,000
2. $25,000 to $50,000
3. $50,000 to $75,000
4. $75,000 +
So, which answer should I choose if my household income is $50,000? Is it option two or option three?
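The fix is to make the brackets mutually exclusive, for example by defining them as half-open ranges so every income lands in exactly one bucket. A minimal sketch in Python using pandas; the boundaries mirror the flawed example above and the labels are my own:

import pandas as pd

# Half-open bins [low, high): $50,000 falls in exactly one bracket.
bins = [0, 25_000, 50_000, 75_000, float("inf")]
labels = ["Under $25,000", "$25,000 to $49,999",
          "$50,000 to $74,999", "$75,000 or more"]

incomes = pd.Series([24_000, 50_000, 75_000, 120_000])
print(pd.cut(incomes, bins=bins, labels=labels, right=False))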
Conflict in meaning between different parts of the question:
Please indicate the products you use most often. Select all that apply.
Cell phone
Toaster
Microwave oven
Vacuum cleaner
Here, the question stem is confusing: asking about “products” (plural) suggests multiple answers, while asking which are used “most often” suggests a single answer. The additional instructions then indicate that multiple answers are allowed. Because of these inconsistencies, some respondents will choose a single answer and others will select multiple answers. So, how do we know the actual frequency of usage of these products? We will never know. Some products will be underrepresented and some will be overrepresented. Would you trust the data from this question? I wouldn’t.
9. Visual layout
Using design elements inconsistently increases the burden on respondents trying to understand what is being asked. For example, encountering different font sizes and colors across questions forces the respondent to relearn their meaning every time they are used.
Also, presenting scales with different directions (positive to negative or vice versa) in rating questions within the same survey increases measurement error, as respondents often assume all rating questions share the same scale direction even when the instructions explain the meaning of the scale’s end points. For instance, if a preference question using a 1-7 scale where 1 means “the most preferred” is followed by an importance question, also on a 1-7 scale but where 1 means “the least important,” respondents who are not paying attention to the instructions (which is quite common) are likely to assume that the 1 in the importance question means “the most important.” I have seen this problem surface many times: respondents are asked a follow-up question conditioned on a previous answer, realize their mistake and tell us they actually meant the opposite.
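The best cure is to keep scale direction consistent in the questionnaire itself, but if you inherit mixed-direction data, reverse-coding before analysis puts every question on the same footing. A minimal sketch, with invented column names and the assumed codings noted in the comments:

import pandas as pd

df = pd.DataFrame({
    "preference": [1, 3, 7],   # assumed coding: 1 = most preferred
    "importance": [7, 5, 1],   # assumed coding: 1 = least important
})

# Reverse-code importance so that, as with preference, 1 marks the
# "high" end of the scale: on a 1-7 scale, reversed = 8 - original.
df["importance_reversed"] = 8 - df["importance"]
print(df)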
10. Analytical plan
Based on the research objectives, both the type of information requested and the question format matter for the type of analysis we plan to perform once the data is collected. If you want to develop a customer satisfaction model using linear regression analysis and the dependent variable is an open-ended question, you can forget about modeling anything. This seems obvious, but I have seen non-researchers write questionnaires without thinking about how they will analyze the data and then come to me asking for analyses that are not appropriate for the data collected.
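To make the point concrete, here is a hedged sketch of the kind of driver analysis that becomes possible when the dependent variable is captured as a numeric rating; all data and variable names are invented for illustration:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical driver analysis: three attribute ratings (1-7)
# predicting a numeric overall-satisfaction rating. Data invented.
rng = np.random.default_rng(0)
attributes = rng.integers(1, 8, size=(200, 3))   # e.g., speed, courtesy, accuracy
overall = attributes @ [0.5, 0.3, 0.2] + rng.normal(0, 0.5, 200)

model = LinearRegression().fit(attributes, overall)
print(model.coef_)   # estimated weight of each attribute

# With an open-ended verbatim as the dependent variable, none of
# this works until the text is first coded into numbers.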
There is also the question of whether we want to replicate the results, track certain events or just run a one-time ad hoc analysis. If the goal is to track certain metrics, time and care should be dedicated to crafting tracking questions, as slight changes in wording can change the meaning of a question and thus its results.
On your way
If you take each of these aspects of survey writing into consideration, you will be on your way to creating surveys that produce valid data and can confidently support strategic and tactical decisions for your business.