Three steps to clarity

Editor’s note: Michael Feehan is co-founder and CEO, and Penny Mesure is a director, of Observant LLC, a Waltham, Mass., research firm. Cristina Ilangakoon is a senior statistical analyst at CTB/McGraw-Hill, a Monterey, Calif., publisher of assessment solutions for the education markets.

To quote Benjamin Franklin, “By failing to prepare, you are preparing to fail.” While this aphorism is frequently used in the sporting world by coaches to reinforce the necessity of practice before a competition, it can easily apply to the world of qualitative research. If we as qualitative market researchers do not gather our data and prepare it for review in a way that is conducive to analysis, we are in effect preparing for an arduous and inefficient (read: stressful) analysis process, and run the risk of missing the mark in generating powerful insight. Taking the time to structure one’s data before diving into any analytic process can be invaluable.

In general, qualitative data (of whatever form) is gathered, structured and then analyzed with the aim of developing themes and drawing associations between those themes to advance understanding of the phenomenon under investigation. Such qualitative research data may come in many forms, whether transcripts of one-on-one or group interviews, transcripts of online bulletin boards, observational data in ethnographic studies or verbatim responses captured in the context of quantitative studies.

In applied or consulting settings this analytic process must often be conducted in a milieu where: a) the objectives are typically set in advance; b) the aims are set by the information needs of the funding body; c) time frames are limited; d) there is often a need to link the data to other quantitative information; and e) the raw data can be extremely voluminous. In the British Medical Journal, Pope et al. (2000) propose an inductive “framework approach” that reflects the accounts and observations of those studied but involves a more structured data collection process than is seen in some other forms of qualitative research and leverages an analytic approach more strongly informed by a priori reasoning. This process generally involves the steps of familiarization (immersing oneself in the raw data); identifying a thematic framework; indexing (coding with text descriptors); charting (going beyond grouping verbatim text to incorporate researcher abstraction and synthesis); and finally mapping/interpretation (interpreting the phenomenon and providing explanations). This is a useful heuristic that provides a map for effective qualitative research and analysis in the commercial sector, even if individual qualitative market researchers may use differing terminology or jargon.

In this process, researchers are essentially engaged in a reductive process, refining and distilling what can be very large volumes of data into manageable units for subsequent analysis and interpretation. In large multi-site qualitative studies (e.g., doing market opportunity assessments in a cross-national study) the volume of data generated through purely qualitative interviews can be enormous. Similarly, in large-N quantitative studies which allow for open-ended responding, the market researcher can be faced with several thousand responses that need to undergo a reductive process to abstract key insights of relevance to clients. Rather than hand off an Excel file of verbatims to a data processor for coding (simply grouping verbatims under topic headers), we recommend an approach whereby the qualitative researcher first structures and examines the data, prior to establishing some kind of thematic coding frame.

We recently conducted a quantitative study of the perceptions consumers hold about their primary care physicians and tested alternate questionnaire design approaches to measure attributes describing doctors and the salience of these attributes to their patients. As part of this project we included an assessment of the likelihood to recommend a doctor, both quantitatively through a rating scale and the calculation of the industry standard Net Promoter Score (NPS) and also through the collection of open-end verbatim responses accounting for that prior rating.

Here we describe an approach to analyzing these qualitative verbatim data that was later leveraged in a large multi-country study of customer loyalty in the consumer electronics market.

A cardinal measure

Across companies as diverse as American Express, eBay, Jet Blue Airways, Symantec, Verizon Wireless, Apple, Amazon, P&G and Merck, executive teams increasingly rely on Net Promoter Score as a cardinal measure of customer loyalty and a key indicator to measure and track their brand performance over time.

This measure, described in Fred Reichheld’s The Ultimate Question as a “foolproof test” (p. 18), highlights the proportion of promoters of the brand relative to the detractors of the brand, in terms of their response to a single question on a 0-10 point scale: “How likely are you to recommend this company/product to a friend or colleague?”

In May 2009 we conducted an online survey of 394 respondents representative of the general U.S. population and asked them: “How likely would you be to recommend your primary care doctor to a family member or friend who was looking for a new doctor?” Using standard criteria, respondents were classed as promoters (P: 9 and 10), neutrals (N: 7 and 8) or detractors (D: 0 through 6). NPS was calculated as NPS = P - D. Our sample contained 61 percent promoters and 17 percent detractors. The NPS for a family doctor in the U.S. is therefore +44, a number that many commercial organizations would love to achieve (albeit rather below the figure expected for a luxury sports car).
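The classification and arithmetic above can be sketched in a few lines of Python (the study itself used standard statistical packages; the data below are illustrative, constructed to match the reported 61 percent promoters and 17 percent detractors):

```python
def nps(ratings):
    """Compute Net Promoter Score from 0-10 likelihood-to-recommend ratings.

    Promoters score 9-10, detractors 0-6; NPS = %promoters - %detractors.
    """
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return round(100 * (promoters - detractors) / len(ratings))

# Illustrative sample: 61% promoters, 22% neutrals, 17% detractors
ratings = [10] * 61 + [7] * 22 + [3] * 17
print(nps(ratings))  # 44
```

Neutrals (7s and 8s) enter the denominator but cancel out of the numerator, which is why NPS rewards moving respondents past the 8-9 boundary in particular.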

Open-ended responses were then gathered to “explain why you gave this rating.” This question resulted in 342 verbatim responses from the 394 respondents. Some of these verbatims were sparse and others more verbose; and on cursory examination some addressed a single concept (e.g., “She understands me.”), while others addressed more than one (e.g., “a nearby office and friendly staff”).

The database was structured and the verbatims analyzed in three stages. The key goal was to use standard tools in most analytic software (SAS, SPSS, Excel) to break down the verbatims into manageable units of text that would allow for speedy review by a researcher.

Step 1: Assessing the volume of response

Prior to exploring the substantive nature of these verbatims, we first tested the hypothesis that those more likely to promote or detract would simply have more to say. That is, a quick metric to gauge “strength of feeling” would be the simple volume of total text associated with each response. To do this, the number of characters used in each verbatim was calculated (using a function in SAS), as was the number of words used.
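The same volume metrics can be computed outside of SAS; a minimal Python sketch (the function name is ours):

```python
def verbatim_volume(text):
    """Return (word_count, character_count) for one open-ended response."""
    words = text.split()  # split on whitespace runs
    return len(words), len(text)

words, chars = verbatim_volume("She understands me.")
print(words, chars)  # 3 19
```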

Both metrics confirmed our hypothesis in that promoters and detractors said more than those in the neutral category. On average promoters used 17 words (70 characters) and detractors used 19 words (76 characters), while neutrals used only 13 words (55 characters). This makes sense: satisfied patients are bubbling over with good things to say, while those who would not recommend their family doctor justify their position at length. Neutrals were just neutral. We are not aware that word/character counting is a standard feature of NPS analysis, but this quick metric was beginning to give us a picture of who these respondents were and their strength of feeling about their doctors (or what would be brands in other contexts).

Step 2: Scoring favorability of each unit of analysis

Each verbatim was then broken down into separate text units, each comprising a different idea or aspect of the verbatim, using punctuation delimiters (i.e., periods, commas, colons/semicolons, along with “and”/“&”), with the exception of periods after “Dr.” Despite similar levels of volume in aggregate, there were significantly more text units per promoter (2.4) than per detractor (1.8). Promoters were more likely to laud their doctor with multiple reasons. Detractors gave fewer overall reasons but used more words to get things off their chest about a single dimension of their doctor’s behavior or style.
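This splitting rule can be sketched in Python; a simplified, assumed version of the delimiter logic (the placeholder token protecting “Dr.” is our own device):

```python
import re

def text_units(verbatim):
    """Split a verbatim into text units on punctuation and 'and'/'&',
    protecting the period in the honorific 'Dr.' from acting as a delimiter."""
    protected = verbatim.replace("Dr.", "Dr<ABBR>")
    # Delimiters: period, comma, colon, semicolon, ampersand, the word 'and'
    parts = re.split(r"[.,:;]|&|\band\b", protected)
    units = [p.replace("Dr<ABBR>", "Dr.").strip() for p in parts]
    return [u for u in units if u]  # drop empty fragments

print(text_units("a nearby office and friendly staff"))
# ['a nearby office', 'friendly staff']
```

A production rule would need more abbreviation handling than this, but the idea is simply to turn each multi-idea verbatim into a list of reviewable units.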

This then allowed the research team to very quickly code the overall tone of each text unit as positive (+1), negative (-1) or neutral (0). We could then explore the balance of tone across the NPS groups, both in terms of the absolute number of positive and negative reasons given, but also the ratio of positive to negative. This empirically allowed us to quickly gauge the degree to which detractors still said something positive about their doctor, and if promoters still identified areas for their doctor’s improvement. As shown in Figure 1, detractors do make some positive (and neutral) comments which could provide important clues as to how to improve the quality of service and thereby convert them to promoters.
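Once each text unit has been hand-coded as +1, 0 or -1, the balance and ratio for an NPS group reduce to simple counting; a minimal sketch with hypothetical codes:

```python
from collections import Counter

def tone_balance(coded_units):
    """Summarize hand-coded text-unit tones (+1/0/-1) for one NPS group.

    Returns counts of each tone plus the ratio of positive to negative units.
    """
    counts = Counter(coded_units)
    pos, neg = counts[1], counts[-1]
    ratio = pos / neg if neg else float("inf")
    return {"positive": pos, "neutral": counts[0], "negative": neg,
            "pos_to_neg": ratio}

# Hypothetical detractor group: mostly negative, but with positive clues
print(tone_balance([-1, -1, 1, 0, -1, 1]))
```

Tracked by NPS group, the `pos_to_neg` ratio gives a quick read on whether detractors still volunteer anything worth building on.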

Step 3: Qualitative coding of themes

At this level each verbatim was then easily coded according to the theme (or themes) it encapsulated, with the list of potential themes being built up progressively and iteratively as text units were reviewed. This was very efficiently done, as each complete verbatim had been restructured for rapid review through its composite text units. We already knew quite a lot about our promoters and detractors from the initial analyses. From this next level of analysis we were able to look at the underlying themes in their responses and draw some inferences about underlying drivers of loyalty to the family physician. Looking at the main themes (Figure 2), we can see that recommendation is primarily driven by the doctor’s personality and ability to communicate with his or her patients. This is more important than medical expertise, which may be a given. Poor perceived time management and efficiency generally stand out as reasons why people give lower scores and are less likely to recommend their physician.

Baseline assessment

We recently applied this approach for one of our clients in the consumer electronics industry. We conducted a baseline assessment of NPS across nine brands, with around 150 consumers reporting on each brand, across seven countries, giving us 9,450 survey responses. Not every respondent provided a verbatim response but, because some verbatims comprised multiple text units and ideas, the volume of material that could inform strategic strengths and weaknesses for the brands was considerable.

To aid this analysis we created a structured database, in a fashion similar to the one described above, that allowed us to review and analyze the verbatims quickly and without undue burden on analyst staff. Without going into the detail of this company’s NPS and process metrics, we were able to produce high-value reports efficiently that yielded key insights into drivers of loyalty.

Analysis of the nearly 10,000 respondents’ qualitative data revealed that among promoters sound quality was one brand’s key strength, though durability and value for money also emerged as important: “It has the highest sound quality and it is reasonably priced. Also, the durability and longevity of the product are the best of any audio equipment I have ever owned. I am a fanatic!”

Analysis of the detractor verbatims highlighted the chink in this brand’s armor as being cost: “It is only an average value for the money you pay. If they were to decrease the prices and/or increase the quality, I’d be more likely to recommend.”

May seem like overkill

To some qualitative market researchers, this multi-stepped and (at least in the initial stages) pseudo-quantitative approach may seem like overkill, especially when the number of open-ended responses is comparatively small and the task of identifying themes (across aggregate verbatim responses) is not too challenging.

However, there are two major advantages to doing this. First, in many corporate NPS studies the sample sizes (and resultant volumes of verbatims) can be simply staggering. For example, in the construction equipment rental market, Peterson (2008) cites researcher Ellen Steck at RSC Equipment Rental, who obtained 23,000 completed customer surveys per year. Larger corporations may generate many times that number.

Many companies may simply focus on the quantitative NPS scores generated and, in the absence of some structured approach, neglect the value of a qualitative analysis of their often-unanalyzed verbatims. Key levers for positive change may thus be missed. In these cases some form of computational algorithm should be used to reduce and structure the verbatim data, to minimize research time and make analysis as efficient as possible. Second, some of the provisional metrics may themselves be useful data to track. Early indicators of improving fortune in the loyalty wars may be things like the volume of words customers say about your brand, the number of ideas they reference about your brand or the ratio of positive to negative ideas.

In terms of next steps, it would be useful to analyze other NPS verbatim data and abstract text to develop positive and negative adjective batteries. These would allow researchers to search for text in the verbatims and code each as positive or negative automatically (rather than having a researcher code each manually). While not perfect, this level of automation is necessary in studies where the researcher may be working with 30,000-40,000 verbatims. Once their direction is coded, subsamples of positive and negative verbatims can then be reviewed and analyzed by the research team for thematic content.
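As a sketch of how such an adjective-battery coder might work (the word lists below are hypothetical placeholders, not drawn from the study; a real battery would be abstracted from actual NPS verbatims as described above):

```python
import re

# Hypothetical adjective batteries for illustration only
POSITIVE = {"caring", "friendly", "thorough", "excellent", "knowledgeable"}
NEGATIVE = {"rude", "slow", "expensive", "rushed", "dismissive"}

def auto_code(verbatim):
    """Provisionally code a verbatim's direction by matching adjective lists.

    Returns 'positive', 'negative', 'mixed', or 'uncoded' (needs human review).
    """
    words = set(re.findall(r"[a-z]+", verbatim.lower()))
    has_pos, has_neg = bool(words & POSITIVE), bool(words & NEGATIVE)
    if has_pos and has_neg:
        return "mixed"
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return "uncoded"

print(auto_code("She is caring and thorough."))  # positive
```

The “mixed” and “uncoded” buckets matter: they flag the verbatims that still need human review, which keeps the automation honest at the 30,000-40,000 scale.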

One interesting perspective that should not be overlooked in this approach comes from an appreciation of the questioning style of qualitative researchers. The way open-ended responses are gathered in the industry-standard NPS assessment is to rely on the recommended question: “What is the most important reason for the score you gave?” (Reichheld, p. 33). This narrow form of questioning can lead to simple lists of reasons without direction or strength of conviction (e.g., “The cost” or “Its quality”). We recommend that clients gathering NPS use an alternative: “Why did you give this score?” This simple change will encourage richer verbatims (e.g., “The terrific quality used to be worth the cost, but isn’t now with cheaper competitors available.”).

More efficiently focus

In sum, by proactively structuring the qualitative data using simple text-editing tools, and conducting preliminary counts of key verbatim types, the qualitative market researcher can more efficiently focus on what is critical: abstracting key insights from very large verbatim sets.


References

Peterson, L.M. (2008). “Strength in Numbers.” International Rental News, 8(3), 37-41.

Pope, C., Ziebland, S., Mays, N. (2000). “Qualitative Research in Health Care: Analyzing Qualitative Data.” British Medical Journal, 320, 114-116.

Reichheld, F. (2006). The Ultimate Question: Driving Good Profits and True Growth. Boston: Harvard Business School Press.