Editor’s note: Vince Raimondi is director of marketing sciences at The Marketing Workshop Inc., a Norcross, Ga., research firm.
A while back, I heard a nugget of wisdom that resonated with me and has become a core belief in my career as a marketing scientist. I'm not even sure where I heard it (yikes!), but the gist of the wisdom is this: The role of any good marketing scientist/analyst/statistician is to uncover important differences in data. If everything we report is "the same" or "constant," then we really haven't provided any meaningful value.
I try to stay focused on how analytics can deliver better, more critical insights, and I'm intent on never losing sight of this. If we are to achieve these goals, as both researchers and analysts, we must work with well-differentiated data.
I've thought back over my own experiences and I've come up with some surefire ways of achieving greater differentiation in survey data. These techniques may or may not be news to you but, even if you're quite familiar with them, it's worth revisiting them to keep them top-of-mind when drafting your next questionnaire.
So, here we go.
Longer point scales
Even though this is fairly straightforward and intuitive, many researchers stick with tried-and-true five-point continuous rating and Likert scales. Admittedly, this makes sense when comparing mean responses to research standards (especially for likelihood-to-purchase comparisons). But when this comparison is not required, moving to a seven-, nine- or 10-point scale will yield greater data differentiation. Seven- or nine-point scales maintain a neutral or indifferent midpoint, which is commonly required. Ten-point scales work well when rating product attribute importance/performance.
Example 1 shows a standard, five-point “likelihood to return” question. Converting it to a seven-point “likelihood to return” question (Example 2) is simple and innocuous.
Alternate labeling of scale points
A slight change in the wording used for scale point labels, especially at scale end points, can go a long way. When considering alternative scale point labels, try to capture a longer continuum of consideration, while keeping an eye on the information you actually need.
Example 3 shows a suggested label enhancement to Example 2’s seven-point scaled question.
Rank order questions
When asking a respondent to rate, say, a lengthy list of brands in terms of reputation, two or more brands will commonly receive the highest rating the respondent gives. Why not ask a follow-up question about which brand has the best reputation, among all those receiving the respondent’s highest rating?
For simplicity, and to keep respondent completion time and fatigue to a minimum, a full rank-ordering of the entire consideration set is not required. Rank-ordering makes more sense as a way to further evaluate top-rated alternatives. That's where consumers make their most important purchase decision trade-offs anyhow, since a second tier of brands, products, etc., is rarely afforded significant thought by consumers.
This approach does require a back-end adjustment of ratings on the part of an analyst/stat person. This adjustment is a form of statistical interpolation but, nonetheless, is fairly straightforward to apply. Example 4 shows new reputation ratings for three brands that each received a respondent’s highest marks (9s on a 1-10 continuous rating scale) among those brands rated.
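Example 4's adjustment isn't reproduced here, but for readers who want to see the mechanics, below is a minimal Python sketch of one plausible form of that interpolation: the tied top ratings are spread evenly between the original top rating and the next-highest rating the respondent gave, in the order of the follow-up ranking. The function name and the even-spacing rule are illustrative assumptions, not our firm's exact formula.

```python
def adjust_tied_top_ratings(ratings, rank_order):
    """Spread tied top ratings using a follow-up rank-order question.

    ratings    -- dict of brand -> rating on a 1-10 continuous scale
    rank_order -- the tied top-rated brands, best first, from the follow-up
    Returns a new dict with the tied ratings interpolated between the
    original top rating and the next-highest rating the respondent gave.
    """
    top = max(ratings.values())
    lower = [r for r in ratings.values() if r < top]
    floor = max(lower) if lower else top - 1   # next-highest rating given
    step = (top - floor) / len(rank_order)     # evenly spaced interpolation
    adjusted = dict(ratings)
    for i, brand in enumerate(rank_order):
        adjusted[brand] = round(top - i * step, 2)
    return adjusted

# Three brands tied at 9; the next-highest rating the respondent gave was 7.
print(adjust_tied_top_ratings(
    {"Brand A": 9, "Brand B": 9, "Brand C": 9, "Brand D": 7},
    rank_order=["Brand B", "Brand A", "Brand C"]))
# {'Brand A': 8.33, 'Brand B': 9.0, 'Brand C': 7.67, 'Brand D': 7}
```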
Straightlining checks and question traps
Straightlining is a respondent’s undesirable repetition of the same response across multiple questions or, more prominently, across the multiple items of a single grid question. Respondents engage in straightlining behavior when looking to complete a survey quickly, with little or no regard for the accuracy of the responses they provide. This results in data that is not only lacking differentiation but just plain wrong (a double whammy).
No doubt this is an age-old problem. But many firms, including ours, have implemented safeguards to combat this behavior and eliminate straightline data from respondent samples. At our firm, we’ve gone as far as to set up mechanisms to let panel companies know which respondents are engaging in straightlining behavior so they may take appropriate action. We also backfill the respondent sample with a different respondent while the study is still fielding.
It’s the responsibility of the researcher to take advantage of today’s on-the-fly question logic technology (for online surveys) and insert appropriate question traps to identify this behavior proactively. One question trap I particularly like is requesting that the respondent provide a specific response (“Please mark a ‘5’ for this item.”) for an item embedded in a list of items being rated.
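As a rough illustration of how such checks can be automated, here is a minimal Python sketch that flags a respondent who fails the embedded trap item or who gives the same rating across (nearly) an entire grid. The function name and the 90 percent threshold are illustrative assumptions; real fielding platforms and thresholds will vary.

```python
def flag_suspect_respondent(grid_ratings, trap_answer, trap_expected=5,
                            max_identical_share=0.9):
    """Flag straightlining or a failed question trap for one respondent.

    grid_ratings -- list of ratings the respondent gave across a grid
    trap_answer  -- response to the embedded trap item
                    ("Please mark a '5' for this item.")
    Thresholds here are illustrative, not an industry standard.
    """
    reasons = []
    if trap_answer != trap_expected:
        reasons.append("failed question trap")
    if grid_ratings:
        most_common = max(set(grid_ratings), key=grid_ratings.count)
        if grid_ratings.count(most_common) / len(grid_ratings) >= max_identical_share:
            reasons.append("straightlining across grid items")
    return reasons  # an empty list means no flags were raised

print(flag_suspect_respondent([7, 7, 7, 7, 7, 7, 7, 7, 7, 7], trap_answer=7))
# ['failed question trap', 'straightlining across grid items']
```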
Choice-based methodologies
This is a favorite one at our firm. Granted, it also ventures into more complicated territory, but the effort made to embrace choice-based methodologies pays high dividends.
Simply put, designing a robust choice-based experiment and scoring choices via max-diff analysis provides well-differentiated data. While the other tips provided here return what some would consider marginally more differentiated data, choice-based approaches undoubtedly provide data that is significantly more differentiated.
The benefits originate from the algorithm commonly used to estimate utility scores from the choices respondents make. This algorithm, hierarchical Bayes estimation, provides utility scores with a theoretically unlimited range; those scores can be transformed into shares of preference or shares of purchase intent (or shares of anything, really, depending on how the choice tasks are worded).
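For readers curious about the share-of-preference step, here is a minimal Python sketch using the common exponentiate-and-normalize (logit) transformation of utility scores. The utilities shown are hypothetical, and other transformations are possible depending on the study design.

```python
import math

def shares_of_preference(utilities):
    """Convert utility scores to shares of preference (in percent).

    utilities -- dict of item -> utility score (e.g., from an HB estimation run)
    Uses the exponentiate-and-normalize (logit) transformation.
    """
    exp_u = {item: math.exp(u) for item, u in utilities.items()}
    total = sum(exp_u.values())
    return {item: round(100 * v / total, 1) for item, v in exp_u.items()}

# Hypothetical max-diff utilities for four product attributes
print(shares_of_preference({"Price": 1.8, "Quality": 0.9,
                            "Warranty": -0.4, "Brand": -2.3}))
# {'Price': 65.2, 'Quality': 26.5, 'Warranty': 7.2, 'Brand': 1.1}
```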
Let’s look at an example of how much more differentiated choice-based share results are compared to mean ratings results.
Mean attribute ratings typically show little differentiation across attributes, as shown in Figure 1. The computed shares of preference show a much wider range than the mean attribute ratings, as shown in Figure 2.
Top-box proportions
I kinda cheated by including this one. I say that because this really has nothing to do with the way the data is collected. We’re not talking about the way a survey question is framed here. Instead, we’re talking about utilizing and reporting out top-box proportions (or top two-box or top three-box) as a summary measure in place of means.
For those not familiar, top-box refers to the proportion of respondents providing the highest value on a scaled rating question. So, for example, a top-box percentage of 55 percent on a five-point scale question means that 55 percent of valid responses received were a 5 and the other 45 percent were a 1, 2, 3 or 4. (In some scenarios, the top-box measure includes "don’t know" or "refused" responses in the denominator.)
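The computation itself is trivial; here is a minimal Python sketch, with the treatment of "don’t know"/"refused" responses controlled by a flag since, as noted, some scenarios keep them in the denominator. The function name and sample data are purely illustrative.

```python
def top_box_proportion(responses, scale_max=5, include_nonresponse=False):
    """Compute the top-box proportion for a scaled rating question.

    responses           -- list of ratings; None marks don't-know/refused
    scale_max           -- highest point on the scale (the "top box")
    include_nonresponse -- if True, keep don't-know/refused in the denominator
    """
    base = responses if include_nonresponse else [r for r in responses if r is not None]
    if not base:
        return 0.0
    return sum(1 for r in base if r == scale_max) / len(base)

ratings = [5, 5, 4, 3, 5, 5, 2, 5, 5, 1, None]
print(round(top_box_proportion(ratings), 2))  # 0.6 -- 6 of 10 valid responses are 5s
```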
What is intriguing about top-box proportions is that, despite effectively consolidating scaled rating values down to a binary value set (1 = top box indicated, 0 = top box not indicated), we wind up with summary measures that are more differentiated. Analytic folks may do a double take when considering this, and I admit I’ve engaged our project managers in lively debates on whether this is appropriate to do.
What I’ve concluded is that using top-box proportions is a mainstay in market research and will continue to be into the foreseeable future. Its ultimate value, besides greater differentiation, lies in its utter simplicity and intuitiveness.
The researcher does need to consider how top-box proportion results are to be interpreted. My thinking is that their greatest value lies in being an indicator of customer delight. Thus, parallels can be drawn between top-box results and results from the Kano model of customer satisfaction.
Furthermore, much depends on the client and the client’s position in the market. If the client is ready to take the next step to being a market leader, then leveraging top-box proportions as a measure of delight is very relevant. However, if the client is lagging market competitors, and drawing at least even to competitors in critical product/service areas is most important, then top-box measures would be inferior to mean ratings or even bottom-box measures.