Improving the customer experience 

Editor’s note: Jeff Mercer is a senior director at technology company Microsoft. Katherine Costain is a director at Microsoft.

Microsoft’s Customer and Partner Experience (CPE) Relationship study is one of the largest satisfaction tracking programs in the world. It surveys Microsoft’s customers and partners – from small businesses to global enterprise companies. It’s offered in 45 languages, spans over 170 countries and generates almost 100,000 responses in a six-month period. 

The CPE Relationship study measures the health of Microsoft’s relationship with its commercial customers and partners and acts as a powerful listening system to understand their needs, pain points and the drivers of satisfaction. Teams across the company worldwide use the metrics and insights from this research to develop targeted initiatives to improve the customer experience. 

The challenge: minimum sample requirements  

To ensure we provide representative and stable scores from the CPE Relationship study, we require a minimum sample size of n=100 for any reported metric. While we easily reach minimum sample requirements for segments at the worldwide level (e.g., enterprise accounts), we often fall short of that threshold for segment and country pairings (e.g., enterprise accounts in New Zealand). For any given wave, we were previously unable to report satisfaction scores for ~50% of segment and country pairings, leaving account leaders with little insight into the satisfaction of their customers and partners in those countries.

This sample-size threshold created a knowledge gap for some countries and posed a challenge for us – how do we provide customer satisfaction insights for segments with low response counts in certain countries? And more importantly, how do leaders in those countries improve the experience with Microsoft for their customers and partners? Are current strategies and programs working? We had qualitative data and verbatims for these smaller groups, but that wasn’t enough. We needed a new way to provide quantitative data that we felt confident in, so that stakeholders could make smarter, more informed, data-driven decisions. 

The solution

To solve this challenge, we worked with our analytics vendor, Success Drivers, to use boosted Bayesian neural network machine learning to model two key satisfaction metrics for countries with low sample sizes:

  1. Overall satisfaction with Microsoft.
  2. Account team quality satisfaction.

Creating the model

The goal of our model is to provide a reliable estimate or prediction of satisfaction for any given segment and country pairing. The model starts with the score measured in our survey and then adjusts it based on a set of predictors. 

We use the following conceptual framework to identify which variables to include in the model:

  1. Starting-level predictors: These variables help determine the starting level of a score for a particular segment and country pairing. For example, the past wave score of a segment and country pairing helps to determine the starting level of the next wave’s score. 
  2. Change predictors: These variables inform which direction (if any) the score should move from the prior wave’s results for a segment and country pairing. For example, the model is more likely to predict an increase in satisfaction for a segment and country pairing if satisfaction scores increased for highly correlated pairings.  
  3. Representativity predictors: These variables are used to understand how confident we can be that the measured score is representative of the market. This is done by understanding how the current time period compares to the average representation across all time periods, including the share of high weights or other representation variables that have been shown to impact scores. 

To bring this all together, let’s look at an example: to calibrate the satisfaction score for enterprise commercial customers in France, the model could leverage one or more of the following inputs (a hypothetical code sketch follows the list):

  1. Starting-level predictors: Past satisfaction scores for enterprise commercial customers in France.
  2. Change predictors: Satisfaction scores for other segment and country pairings that are correlated with the satisfaction score of enterprise commercial customers in France (e.g., medium-sized commercial customers in France).
  3. Representativity predictors: The share of respondents who have had direct engagement with a Microsoft representative in the past three months (a metric that has been shown to differentiate satisfaction scores) compared to the average share in prior waves.
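
To make the framework concrete, here is a minimal sketch in Python of how these three predictor types might be assembled as model inputs for the France example. All names and numbers are hypothetical illustrations; the actual model is the boosted Bayesian neural network built with Success Drivers and is not reproduced here.

```python
# Hypothetical sketch of the three predictor types for one segment and
# country pairing. All field names and values are invented for illustration.

def build_features(past_scores, correlated_pairings, rep_share, avg_rep_share):
    """Assemble starting-level, change and representativity predictors."""
    return {
        # Starting-level: the prior wave's score anchors where this wave begins.
        "prior_wave_score": past_scores[-1],
        # Change: score movement in highly correlated pairings suggests direction.
        "correlated_deltas": [
            p["current"] - p["prior"] for p in correlated_pairings
        ],
        # Representativity: how this wave's respondent mix compares to the norm.
        "engagement_share_delta": rep_share - avg_rep_share,
    }

# Enterprise commercial customers in France (values invented):
features = build_features(
    past_scores=[7.8, 7.9, 8.0],
    correlated_pairings=[{"current": 8.2, "prior": 8.0}],  # e.g., medium-sized, France
    rep_share=0.62,       # share with recent account-team engagement this wave
    avg_rep_share=0.55,   # average share across prior waves
)
```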

The final step of the process is to apply a formula that takes into consideration sample size, the measured score and the modeled score. As sample size increases, our confidence that the measured score represents the market also increases, which needs to be accounted for in the final modeled score output. The formula places more weight on the predicted score when sample sizes are small and more weight on the measured score when sample sizes are large.
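
As a rough illustration, a credibility-style weight is one common way to implement this kind of sample-size-dependent blend. The sketch below assumes a weight of n/(n + k); the weighting form, the constant k and the scores shown are assumptions for illustration, not our production formula.

```python
def blend_scores(measured, modeled, n, k=100):
    """Blend measured and modeled scores by sample size.

    The weight on the measured score grows toward 1 as n increases.
    The weighting form and the constant k are assumed for illustration.
    """
    w = n / (n + k)  # more weight on the measured score as n grows
    return w * measured + (1 - w) * modeled

# Small samples lean on the model; large samples lean on the measurement.
print(blend_scores(measured=7.5, modeled=8.1, n=50))   # ~7.90, leans modeled
print(blend_scores(measured=7.5, modeled=8.1, n=300))  # ~7.65, leans measured
```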

Validating the model

The model was validated by running thousands of simulations to assess how closely modeled scores matched actual scores. Leveraging segment and country pairings where we had very large sample sizes and high confidence in the measured results, we tested how well the model predicted the satisfaction score at varying sample sizes. The models predicted the actual score effectively even with low sample sizes. For example, a modeled score based on a sample size of n=50 predicted the satisfaction score with less error than if we had been able to survey 100 customers! 
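
The sketch below illustrates that simulation logic under stated assumptions: it treats the full-sample mean of a very large pairing as ground truth, then compares the error of modeled scores at n=50 against plain measured scores at n=100. The model_score function is a hypothetical stand-in for the actual model.

```python
import random
import statistics

def validate(responses, model_score, n_small=50, n_large=100, sims=1000):
    """Compare the error of modeled scores at n_small with the error of
    plain measured scores at n_large, using a pairing whose full sample
    is large enough to treat its mean as ground truth.

    `model_score` is a hypothetical stand-in for the production model.
    """
    truth = statistics.mean(responses)
    modeled_errors, measured_errors = [], []
    for _ in range(sims):
        small_sample = random.sample(responses, n_small)
        large_sample = random.sample(responses, n_large)
        modeled_errors.append(abs(model_score(small_sample) - truth))
        measured_errors.append(abs(statistics.mean(large_sample) - truth))
    return statistics.mean(modeled_errors), statistics.mean(measured_errors)
```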

The model was further validated by graphing both the measured and modeled score over time. These long-term graphs show that the modeled scores from low sample sizes followed similar patterns to measured scores, but with reduced volatility. In addition, the measured and modeled scores from high sample sizes mapped almost exactly to each other. Both validations gave us confidence that modeled scores are a strong, representative predictor of satisfaction for any given segment and country pairing. 

Understanding the satisfaction of customers and partners

Because we are confident that modeled scores provide reliable estimates of segment and country performance, we now use these scores in our semi-annual reporting when sample sizes are between n=50 and n=99.* This process has increased our coverage of reported segment and country combinations by ~20% for overall satisfaction and ~25% for account team quality satisfaction.
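
The resulting reporting rule can be sketched as follows. Here, blended is the output of the sample-size weighting described earlier, and treating pairings under n=50 as unreported is our inference from the coverage rule above.

```python
def reported_score(measured, blended, n):
    """Reporting rule: measured scores at n >= 100 (see footnote), blended
    modeled scores at 50 <= n <= 99; smaller pairings stay unreported."""
    if n >= 100:
        return measured
    if n >= 50:
        return blended
    return None  # below the n=50 floor, no score is reported
```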

Since applying this innovation in our CPE Relationship Program, we have received enthusiastic feedback from our stakeholders. Modeled scores have become one measure that country and segment leadership can use to corroborate and understand the satisfaction of their customers and partners. We look forward to continuing to see the impact our modeled scores have on leadership’s ability to improve the customer experience with Microsoft.


* We will continue to use measured scores when sample sizes are n=100+.