Editor's note: Marco Vriens is managing director strategic analytics and senior vice president methodology at The Modellers, a Salt Lake City, Utah, research firm. He is based in La Crosse, Wis. Pat Kidd is client services senior vice president at The Modellers. She is based in the firm’s Princeton, N.J., office.

While tracking studies are valuable, they can also be plagued by unexplainable movements in the metrics and a dearth of actionable insights. The result can be an erosion of credibility and acceptance by management or, even when the results are seen as credible, a lack of compelling action items. Before changing vendors or cutting funding for the tracking work, we recommend examining whether the situation can be improved. It often can, as we will explore in this article, if you use smart analytics to obtain more credible and more actionable insights from your tracking studies.

Erratic results

Counterintuitive or seemingly erratic results can erode confidence in the tracking study as a whole and decrease the likelihood that stakeholders will use its findings.

When presented with this type of result, the first step is to be certain that confidence intervals have been calculated correctly across waves. This may seem obvious but we have seen many cases where it was not done properly. Tracking studies whose samples require weighting prior to reporting need to use a special formula for the confidence intervals, one that accounts for the design effect introduced by the weights. Weighting often significantly widens the confidence intervals; it’s not uncommon to see an increase of 50 percent or more. What seems to be an erratic movement of a metric can actually be perfectly within the bounds of the confidence intervals once the impact of weighting is taken into account.
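As an illustration, one common correction uses Kish’s effective sample size, which widens the interval in proportion to the unevenness of the weights. A minimal sketch in base R (the function name and simulated data are ours):

  # Weighted proportion with a design-effect-adjusted confidence interval
  # (Kish approximation). 'y' is a 0/1 response vector, 'w' the case weights.
  weighted_ci <- function(y, w, conf = 0.95) {
    p     <- sum(w * y) / sum(w)                # weighted proportion
    deff  <- length(w) * sum(w^2) / sum(w)^2    # Kish design effect
    n_eff <- length(w) / deff                   # effective sample size
    se    <- sqrt(p * (1 - p) / n_eff)
    z     <- qnorm(1 - (1 - conf) / 2)
    c(estimate = p, lower = p - z * se, upper = p + z * se, deff = deff)
  }

  # Example: uneven weights widen the interval relative to the unweighted case
  set.seed(1)
  y <- rbinom(500, 1, 0.4)
  w <- runif(500, 0.2, 3)
  weighted_ci(y, w)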

A second remedy is ensuring that response-style effects are mitigated as much as possible. Obvious ones, such as straightlining, can be easily identified and dealt with. Other response-style biases are harder to detect. Respondents can vary substantially in how they use typical discrete response scales (e.g., five-, seven- or nine-point rating scales). Some may tend to use the middle of the scale while others may prefer the extremes. Such effects can occur at any time but are probably worse in cross-cultural, cross-country research. This heterogeneous scale usage compromises any comparison across countries, segments or waves. It also causes an upward bias in the covariances among various metrics (e.g., brand ratings) which, like dominoes falling, goes on to compromise driver (regression) models.
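Flagging straightliners, for instance, takes only a few lines. A minimal sketch, assuming a respondent-by-item matrix of ratings:

  # Flag straightliners: respondents with (near-)zero variance across a
  # battery of rating items. 'ratings' is a respondent-by-item matrix.
  flag_straightliners <- function(ratings, sd_cutoff = 0) {
    item_sd <- apply(ratings, 1, sd, na.rm = TRUE)
    item_sd <= sd_cutoff        # TRUE = answered every item identically
  }

  set.seed(2)
  ratings <- matrix(sample(1:5, 200 * 10, replace = TRUE), nrow = 200)
  ratings[1:5, ] <- 3                      # plant five straightliners
  which(flag_straightliners(ratings))      # recovers rows 1-5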

A common approach, centering the data, is not a fix. A better one is the approach published in the Journal of the American Statistical Association1. It can be programmed relatively easily in R (a popular statistical analysis language that lets market researchers quickly program statistical solutions that may not be available in mainstream packages such as SPSS or SAS). Apply it as a best-practice data-preparation step, before any driver analysis or clustering/segmentation, since those analyses will otherwise be working from a biased covariance matrix.
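For reference, the model is implemented in the bayesm R package (co-written by authors of the paper) as rscaleUsage(). A minimal sketch using the example data that ships with the package; the MCMC settings here are illustrative only:

  # Scale-usage correction via the Rossi/Gilula/Allenby hierarchical model,
  # as implemented in the bayesm package.
  library(bayesm)

  data(customerSat)                        # bundled 10-point satisfaction battery
  out <- rscaleUsage(
    Data = list(k = 10, x = as.matrix(customerSat)),  # k = number of scale points
    Mcmc = list(R = 2000)                  # MCMC draws; use more in practice
  )

  # Posterior means of the scale-use-corrected attribute means; downstream
  # driver or segmentation work should start from these corrected quantities
  # rather than the raw ratings.
  colMeans(out$mudraw)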

Halo responders, whose emotions about the brand or topic being explored spill over into all of their ratings, are more difficult to identify and correct for. However, it’s important to find them in a sample because their responses do not provide much information about individual brand attributes; they are instead more a measure of how the respondent feels overall about the brand. Halo response styles also produce multicollinearity, which makes it more difficult to determine the drivers of key brand metrics. Recently, models have been developed that separate the haloers from the non-haloers and can thus better identify the areas where the client performs well and where it performs less well.
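The published models are mixture models estimated within a Bayesian framework. As a rough screening heuristic only (our own illustration, not the method in the literature), one can flag respondents whose attribute ratings barely deviate from their overall brand rating:

  # Rough halo screen (a heuristic, not the mixture model in the papers):
  # flag respondents whose attribute ratings cluster tightly around their
  # overall brand rating. 'attrs' is a respondent-by-attribute matrix,
  # 'overall' the overall brand rating.
  flag_halo <- function(attrs, overall, mad_cutoff = 0.5) {
    spread <- rowMeans(abs(attrs - overall), na.rm = TRUE)
    spread <= mad_cutoff     # TRUE = candidate halo responder
  }

  set.seed(3)
  attrs   <- matrix(sample(1:10, 300 * 8, replace = TRUE), nrow = 300)
  overall <- round(rowMeans(attrs)) + sample(-1:1, 300, replace = TRUE)
  table(flag_halo(attrs, overall))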

Boost action potential

Dealing with haloers will increase the likelihood that the tracking results will identify areas for improvement. However, to really boost action potential you need to do two things:

  • Make the case for causality (i.e., what is truly driving the primary variable of interest, be it overall satisfaction or brand preference) and use advanced analytics to identify the most likely drivers. 
  • Pinpoint competitive users most likely to switch to your brand.

Imagine you’re the marketing v.p. who’s just received tracking results that again show a decrease in the quarterly customer satisfaction scores when you’ve promised a 5 percent increase. Your market research team shows you a standard regression with 10 significant attributes. However, you feel uneasy about this analysis because the effect sizes seem very small and the 10 attribute drivers are strongly intercorrelated, which raises the question of whether some brand metrics are affecting other brand metrics.

You should be worried. First, standard regression will not give a valid read on what the most likely drivers of satisfaction are and as a result cannot inform the decision-maker on improvement priorities. It is well known that multicollinearity (i.e., highly intercorrelated potential drivers, such as the satisfaction attributes) produces a large number of marginally significant but small (uninteresting) effects. More importantly, the individual regression coefficients cannot be meaningfully interpreted, which leads to suboptimal investments and missed opportunities.
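A quick diagnostic for this situation is the variance inflation factor (VIF) of each candidate driver. A minimal sketch in base R, with simulated data standing in for the survey metrics:

  # Diagnose multicollinearity among candidate drivers with variance
  # inflation factors, computed from base R alone.
  vif <- function(X) {
    sapply(seq_len(ncol(X)), function(j) {
      r2 <- summary(lm(X[, j] ~ X[, -j]))$r.squared  # each driver vs. the rest
      1 / (1 - r2)
    })
  }

  set.seed(4)
  z <- rnorm(400)
  X <- sapply(1:6, function(i) z + rnorm(400, sd = 0.4))  # intercorrelated attributes
  colnames(X) <- paste0("attr", 1:6)
  sat <- drop(X %*% runif(6, 0.1, 0.3)) + rnorm(400)

  round(vif(X), 1)      # rules of thumb flag VIFs above 5 or 10
  summary(lm(sat ~ X))  # note the inflated standard errors, small coefficients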

This can be one of the reasons why executives are reluctant to act on driver models (see “The cure for infophobia” in the March 2013 issue of Quirk’s). From a consumer or respondent behavior point of view, there are two dynamics we need to account for.

One, respondents who provide haloed responses should not be used in the driver (regression) modeling because they undermine what we are trying to achieve in the analysis: identifying the most likely drivers of satisfaction. For these haloers, we know that their overall satisfaction drives their responses on the individual brand attributes, not the other way around.

Two, some brand metrics may have an impact on other metrics, not just on the final overall metric (e.g., overall satisfaction). For example, if you improve product quality, ratings of “quality of technical support” may go up too, simply because customer issues become less severe. So improving product quality may have a direct effect on overall satisfaction but it can also have an indirect effect because it improves perceptions of the quality of technical support.
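The cited work estimates such effects within a richer Bayesian model; as a simplified illustration of the underlying idea, here is the classic product-of-coefficients decomposition of a direct and an indirect effect (simulated data and hypothetical variable names):

  # Simplified illustration of direct vs. indirect effects using two
  # regressions (product-of-coefficients mediation).
  set.seed(5)
  n       <- 500
  quality <- rnorm(n)
  support <- 0.6 * quality + rnorm(n)           # quality improves support ratings
  sat     <- 0.4 * quality + 0.5 * support + rnorm(n)

  a <- coef(lm(support ~ quality))["quality"]   # quality -> support
  m <- lm(sat ~ quality + support)
  b <- coef(m)["support"]                       # support -> satisfaction
  direct   <- coef(m)["quality"]                # quality -> satisfaction, direct
  indirect <- a * b                             # quality -> support -> satisfaction

  c(direct = unname(direct), indirect = unname(indirect),
    total = unname(direct + indirect))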

Dealing with halo response styles and estimating indirect effects will yield more interesting (bigger) effects and more differentiated results, making it easier to prioritize; the identified drivers are also more likely to be truly causing satisfaction. Doing this is possible but it’s not straightforward. Two recently published papers2,3 supply the ingredients for this modeling strategy: one discusses how to deal with halo respondents and the other how to estimate indirect effects. We applied this combined modeling approach to several datasets and found exactly what statistical theory says we should find: fewer statistically significant drivers but with a higher average effect size, more differentiation and a stronger case for causality.

Imagine you are the marketing director of a global credit card brand and in your annual commitment you pledged to acquire 5 percent net new customers. You have a survey in which a sample of prospects has rated your brand, along with a number of competing credit card brands, on a set of brand attributes, including, say, an overall brand favorability rating. The decision you need to make: Who should be mailed credit card offers, and what marketing message would lead to the highest conversion rate? You could mail only to those respondents who already give top two-box overall favorability ratings. The problem is, there aren’t many of those and even if all of them respond (i.e., acquire the credit card), it won’t get you to your goal. So you need a smarter approach: the switchable-consumer approach.

In this approach we first run a logit regression model that identifies which brand perceptions are most predictive of whether a person gives a top two-box overall favorability rating or not. Once we have this model, estimated at the sample level, we can use it to calculate each individual’s probability of being in the top two box versus the middle box versus the bottom box. There is usually considerable differentiation among the respondents who rated the brand somewhere in the middle of the overall favorability scale (say, they gave it an 8, 7, 6 or 5 on a 10-point rating scale). Some will show a higher-than-50-percent likelihood of being in the top two box; others a (much) lower one. A respondent with a more than 50 percent probability of being in the top two box, but an actual rating in the middle, is called a switchable consumer. Yes, currently the brand is rated in the middle box but on the other scores, especially on the drivers of overall favorability, this respondent looks very similar to a respondent who rates the brand top two-box.
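A minimal sketch of these mechanics (simulated data; a binary top-two-box logit stands in for whatever specification a real study would use):

  # Switchable-consumer scoring: fit a logit for top-two-box favorability,
  # then flag middle-box raters whose predicted probability exceeds 50 percent.
  set.seed(6)
  n       <- 1000
  trust   <- rnorm(n); value <- rnorm(n); service <- rnorm(n)
  fav10   <- pmin(10, pmax(1, round(5.5 + 1.2 * trust + 0.9 * value +
                                    0.7 * service + rnorm(n))))

  dat      <- data.frame(trust, value, service,
                         top2 = as.integer(fav10 >= 9))
  fit      <- glm(top2 ~ trust + value + service, data = dat, family = binomial)
  dat$p_t2 <- predict(fit, type = "response")

  # Switchable: rated the brand mid-scale (5-8) but looks like a
  # top-two-box rater on the drivers.
  switchable <- fav10 >= 5 & fav10 <= 8 & dat$p_t2 > 0.5
  sum(switchable)          # size of the target list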

This means that these prospective customers are really on the fence: Make your brand slightly more appealing (e.g., by exposing them to the right message) and they become likely top two-box raters and, as a result, highly likely to respond to your marketing efforts. This approach has been tested and validated: ABB Electric made more than $100 million using switchable-consumer insights4,5 and we have successfully applied the approach in our commercial work.

Squeeze value

So, if your tracking study seems to be getting stale, turn to advanced analytics to squeeze value out of the data and in the process save yourself the trouble of onboarding a new vendor, transitioning the study or giving up on it altogether.

References

1 Rossi, P.E., Z. Gilula and G.M. Allenby. “Overcoming scale usage heterogeneity: a Bayesian hierarchical approach.” Journal of the American Statistical Association. 96, 453 (2001), 20-31.

2 Chandukala, S.R., J. Dotson, J.D. Brazell and G. Allenby. “Bayesian analysis of hierarchical effects.” Marketing Science, 30, 1 (2011), 123-133.

3 Buschken, J., T. Otter and G. Allenby. “The dimensionality of customer satisfaction survey responses and implications for driver analysis.” Working paper. 2012.

4 Vriens, M. The Insights Advantage: Knowing How to Win. Bloomington, Ind., iUniverse.com. 2012.

5 Gensch, D., N. Aversa and S. P. Moore. “A choice modeling market information system that enabled ABB Electric to expand its market share.” Interfaces, 20, 1 (1990), 6-25.