Anticipate and plan

Editor's note: Keith Chrzan is chief research officer at Maritz Research, St. Louis.

Linkage is the formal, statistical process by which we connect input measures from one data source to output measures from another. Varying levels of sophistication differentiate “simple” from “complex” linkage. Complex linkage involves a multivariate predictive model of some sort that enables what-if simulations, while simple linkage matches a single input variable from one data set to an output variable in another, for example linking a survey measure to a behavioral measure in order to validate the survey metric.

Five potential obstacles threaten the validity of linkage models, particularly of complex linkage models. Failing to anticipate and plan for these could cause a catastrophic failure of the linkage model.

1. Level of aggregation

In any linkage we need to make sure that the survey data and the external data are prepared at the same level of aggregation. For example, we might have satisfaction survey data for customers who report their time spent waiting to check into a given hotel property but staffing data for the property as a whole, not for the time the individual customer walked in the door. In this case we cannot link the data until we aggregate the respondent-level satisfaction information into a mean score for the entire property. Three problems can result when we do this:

  • We may not have enough respondents at the property level to compute a stable mean score.
  • We may not have enough observations at the aggregated level to run a stable analysis. For example, if we had 1,000 respondents per property but only 10 properties, that becomes just 10 observations in our aggregated analysis, not nearly enough to allow us to reach conclusions with confidence.
  • When we do aggregate respondent-level data together into means, we often lose a lot of the variability and we end up with very “flat” data. This flatness will often produce very small effects in our linkage analysis.
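The second and third problems above can be sketched with a quick simulation. This is a hypothetical illustration (made-up property counts and ratings, not real survey data) showing how aggregation collapses both the number of observations and the variability of the data:

```python
import random
import statistics

random.seed(42)

# Hypothetical setup: 10 properties, 100 respondents each,
# satisfaction rated 1-10 at the respondent level.
properties = {
    p: [random.randint(1, 10) for _ in range(100)] for p in range(10)
}

respondent_scores = [s for scores in properties.values() for s in scores]
property_means = [statistics.mean(scores) for scores in properties.values()]

# 1,000 respondent-level observations collapse to just 10 rows
# for the aggregated analysis.
print(len(respondent_scores), "->", len(property_means))

# The spread of the data collapses with them ("flat" data):
print(round(statistics.stdev(respondent_scores), 2))  # respondent-level spread
print(round(statistics.stdev(property_means), 2))     # much smaller property-level spread
```

With a thousand respondents the property means sit tightly together, so a model run across the 10 aggregated rows has both too few observations and too little variance to work with.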

The fix for potential level of aggregation problems resides in the blueprinting stage of the project. Make sure you have a complete inventory of the data that will be available for linkage and have all the relationships among variables mapped out in detail so that mismatches in levels of aggregation will be evident.

2. Model specification

When we build a statistical model to predict some outcome, the model is properly specified only if it contains all the variables that influence the outcome. To the extent it fails to do so it exhibits underspecification or misspecification. For example:

  • We link a customer’s subsequent repurchase (or not) of a brand of automobile to her overall satisfaction with her previous experience with that brand. But whether a person buys a given brand also depends on whether competitors have come out with new models, new advertising, new features, new prices, spiffs or rebates, etc. Predicting repurchase based only on satisfaction with a single brand’s experience misses parts of the story, and a distorted view of the relationships may emerge from such an underspecified model.
  • We link customer satisfaction with a bank branch to cross-sales data at that branch. But how many IRAs the branch sells to its checking account customers isn’t just a matter of the satisfaction of those customers, it’s also a function of the population served by the branch, the income, employment and life stage of its customers, and the larger economy.

Ideally we prevent specification error by conducting a blueprinting session and identifying all the factors that affect our outcome variable.
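The distortion an underspecified model produces can be demonstrated with simulated data. In this hypothetical sketch (invented effect sizes, not estimates from any real study), the outcome depends on both satisfaction and an unmeasured competitor-promotion variable; omitting the latter biases the satisfaction coefficient:

```python
import random

random.seed(7)

# Hypothetical data: the outcome depends on satisfaction AND on competitor
# promotion activity, which our model fails to measure.
n = 500
satisfaction = [random.gauss(7, 1.5) for _ in range(n)]
# Suppose rivals promote hardest where satisfaction is low, so the omitted
# variable is correlated with the one we keep.
promo = [10 - s + random.gauss(0, 1) for s in satisfaction]
outcome = [0.5 * s - 0.3 * p + random.gauss(0, 0.5)
           for s, p in zip(satisfaction, promo)]

def slope(x, y):
    """Ordinary least-squares slope of y on a single predictor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

# The underspecified model (satisfaction only) absorbs the promotion
# effect: its slope lands near 0.8 rather than the true 0.5.
print(round(slope(satisfaction, outcome), 2))
```

The satisfaction coefficient gets credit for the competitor activity it happens to correlate with, which is exactly the distorted view of relationships described above.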

Additional aspects of model specification to consider include:

  • Missing data. Sometimes not all respondents experience, or can rate, all aspects of the potential relationship with the brand.
  • Non-linear effects. The influence of attributes on outcomes may not be linear, which will alter our modeling approach.
  • Sampling. We need to make sure the sample on which we base our model adequately represents the population to which we want to forecast.
  • Timing of prediction. Current measures of performance (satisfaction, etc.) may be more strongly related to future outcomes than to current ones, suggesting a longitudinal linkage.
  • The direction of causality. Sometimes we see that business units with higher levels of satisfaction also have lower sales volume. Rather than infer that satisfaction causes lower sales, we usually find that lower-sales business units are less busy and can offer customers more attention.
  • Whether the effects are best measured as cross-sectional or not. Cross-sectional analysis focuses on the differences between high and low satisfaction respondents, say, or between high-performing and low-performing business units; when, as in the previous point, volume itself may be masking the true relationship, running analysis within business units (or at the respondent level) may yield a more accurate measurement.
  • Whether non-compensatory processes drive outcomes. Most models assume compensatory, linear processes (e.g., a low score on one attribute might be made up or compensated by high scores on some other attributes). Plausibly, some real processes may instead be non-compensatory: If my wireless carrier has poor coverage in my primary usage area, it may not matter at all that the service rep is friendly, the contract clear and the bills small – the provider might be unacceptable based on coverage failure alone. Indeed, when we use our firm’s new Make or Break model of customer satisfaction we invariably find non-compensatory processes not captured by more common modeling techniques.

3. Model complexity

The number of variables

Some of our complex linkages contain many variables. By simple division, if we have 40 variables, each will average only 2.5 percent of the total influence on the outcome. Even large changes to individual variables will move outcome variables too little to get excited about or to interest senior management.

Like the other potential problems, it’s best to fix this one in the design stage – make the number of variables in your predictive models no larger than required for proper model specification.

Bolting

Bolting is a vivid image for the way multiple sources of survey data connect. For example, we may connect some transactional surveys to a relational survey and the relational survey to some outcome metric. The bolting itself contains error, the result of two distinct processes:

  • Transactional surveys and relational surveys take place in different contexts and the questions within them, even if they have identical wording, may be biased by the contexts in which they’re asked. There may be no way around this problem but it pays to be aware of it and to make the contexts as similar as possible.
  • The samples that comprise transactional and relational surveys may be very different - and different in complex ways not easily sorted out by simple weighting. To name just one example, many of our transactional surveys do frequency-based sampling (e.g., frequent customers receive more surveys) while our relational surveys often do not.

An analogy to accompany the vivid term might be that differential context and sampling effects mean the bolt is too skinny and wiggles around in the hole, while the nut that attaches to the bolt has the wrong-size thread, perhaps in the opposite direction. All else being equal, one should try to minimize the amount of bolting built into a linkage model.

4. Multicollinearity

Multicollinearity happens when two or more variables move up and down together as you look across respondents. If multicollinearity occurs among three retail attributes, such as “wide aisles,” “well-lit parking lot” and “wide selection” (and we often see this occur for just such seemingly unrelated variables), then the three tend all to be high when any one of them is high and all to be low when any one of them is low (i.e., they’re highly correlated). So when we go to simulate an improvement in “wide aisles,” say, if we simulate it in a way it never actually occurs (rising all by itself, with no rise in the other two attributes’ scores) we’re punishing the attribute, relative to the way it occurs in nature (or in our data set).

Multicollinearity leads to a cluster of technical problems, most importantly that it makes the model unstable: it adds error to the importances that result from the model, making the results potentially very misleading (some importances may be much smaller than they should be, while others may be much larger).
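That instability is easy to reproduce in a simulation. In this hypothetical sketch (two nearly collinear predictors, only one of which truly drives the outcome), refitting the same model on fresh samples sends the individual importances swinging wildly even though their sum stays stable:

```python
import random

random.seed(1)

def ols2(x1, x2, y):
    """Two-predictor OLS on centered variables, via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c = [v - my for v in y]
    saa = sum(v * v for v in a)
    sbb = sum(v * v for v in b)
    sab = sum(p * q for p, q in zip(a, b))
    say = sum(p * q for p, q in zip(a, c))
    sby = sum(p * q for p, q in zip(b, c))
    det = saa * sbb - sab * sab
    return ((sbb * say - sab * sby) / det, (saa * sby - sab * say) / det)

def simulate():
    # "Wide aisles" and "well-lit parking lot" move almost in lockstep...
    aisles = [random.gauss(0, 1) for _ in range(60)]
    parking = [a + random.gauss(0, 0.05) for a in aisles]
    # ...and only aisles truly drives the outcome (true weights: 1 and 0).
    outcome = [a + random.gauss(0, 1) for a in aisles]
    return ols2(aisles, parking, outcome)

# The two estimated importances swing wildly from sample to sample, even
# changing sign, though their sum stays near the true total effect of 1.
for _ in range(3):
    b1, b2 = simulate()
    print(round(b1, 2), round(b2, 2), round(b1 + b2, 2))
```

The model cannot tell the two correlated predictors apart, so the credit for the shared variation lands almost arbitrarily on one or the other in any given sample.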

The best solution to the multicollinearity problem is to fix it in the design stage of the study: carefully pretesting and building attribute lists to refine their psychometric properties. Alternatively, we can opt to base the importance weights in our linkage models on valid stated importance measures like best-worst scaling, Q-sort, the method of paired comparisons or constant sum scaling. The multicollinearity problem vanishes if we’re not trying to derive importances from halo-affected, highly correlated predictors. Most often, however, we use a variant of regression analysis based on information theory – Theil’s relative importance analysis – that appears to be robust with respect to multicollinearity.

5. Respondent heterogeneity

Respondents differ from one another; they have different tastes, preferences and hot buttons. If we could estimate a separate importance model for each respondent, each person would have different weights for the different attributes; that’s what respondent heterogeneity means.

When we run a derived importance model, we get a set of attribute importance weights that represents the average importances across all respondents. Of course this average can hide some pretty important differences; in fact, the average may represent no one respondent at all. Linkage models applying these average importances to every respondent in the study produce inaccurate predictions for many of the respondents, resulting in poorer overall prediction.
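A deliberately extreme hypothetical shows how an average importance can represent no one. Suppose two equal-sized segments weight a "price" attribute in opposite directions (the segment labels and weights below are invented for illustration):

```python
# Hypothetical illustration: two equal-sized segments weight "price" oppositely.
# Segment A is price-sensitive (+1.0); segment B is premium-seeking (-1.0).
segment_weights = {"A": 1.0, "B": -1.0}

# The pooled (average) importance is zero, a weight that describes no one:
average_weight = sum(segment_weights.values()) / len(segment_weights)
print(average_weight)

# Applying the average to every respondent predicts no price effect at all,
# while every respondent in fact has a strong one.
price_change = 2.0
for seg, w in segment_weights.items():
    true_effect = w * price_change
    predicted_effect = average_weight * price_change
    print(seg, true_effect, predicted_effect)
```

Real heterogeneity is rarely this stark, but the same logic applies in milder form: the pooled weights sit between the segments and mispredict for all of them.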

We have several tools in our analytical toolbox for addressing this problem of heterogeneity.

  • If the linkage involves connecting individual respondent-level data to individual-level outcomes, consider using valid stated importance methods (best-worst scaling, the method of paired comparisons, Q-sort or constant sum scaling). When done properly, stated importances have been found to have excellent predictive validity. Stated importance measures come at a cost in questionnaire real estate, however, so in many cases they may not be the best solution.
  • Other times we can examine segmenting variables to see which have a moderating effect on regression coefficients, then incorporate them into a moderated regression analysis.
  • Finally, if we lack or if we want to supplement known segmenting variables we can use latent-class versions of Theil’s importance analysis or of PLS modeling – methods which simultaneously quantify drivers and driver-based segments of respondents.

Take action

Don’t leave anything to chance – make sure to consider these five threats and take action to reduce or eliminate them.