Seeking the correct diagnosis
Editor’s note: Andrew Aprill is founder of BioVid, a Princeton, N.J., research firm. Matt Campion is executive vice president, market research business, at Epocrates, a San Mateo, Calif., software firm.
Remember that big decision you made last week and the physician-based market research findings that influenced your decision-making? Did you ever stop to wonder who answered your survey questions and if they provided thoughtful responses? As pharmaceutical market research continues its rapid migration to online surveys, data quality is increasingly being questioned by end users. We applaud this scrutiny and offer the following systematic approach to maximizing the quality of physician-based survey data.
Step 1: Assess panel quality and examine the physician authentication process.
While there are many ways to recruit physicians to participate in online surveys, one popular method is to draw from established physician panels. Panels are attractive because they can provide fast, reliable and cost-effective access to physicians who are ready and willing to participate in online surveys.
Before they are ready to serve you, panel providers have to construct their panel by recruiting members. Common recruitment methods include mail, fax-blasting and telephone. Frequently, physicians recruited via these methods are offered some type of sign-up incentive. Sometimes panels are created from preexisting networks of physicians where membership offers benefits beyond the opportunity to participate in surveys. Whatever the panel construction approach, the key is that the panel provider has established a relationship with panel members that results in robust and reliable response rates. Methods to verify that the person joining a panel is indeed a physician and not a clever layperson registering as a physician in order to earn lucrative honoraria vary widely across panel providers. Survey stakeholders should examine the verification process of their panel provider by asking the following questions:
- How was the panel created?
- Is the panel a genuine panel with an established relationship with opted-in physicians or simply a database?
- What is the specific process for gathering physician credentials?
- What credentials are gathered and how secure are they?
- Are these credentials matched against an official database of known physicians?
- Is the credential-matching process an automated “hard” match or a manual “fuzzy” process requiring human intervention? (See the sketch following this list.)
- Are physicians verified at the time of registration or after the fact?
- What other safeguards are used to discourage fraud?
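For readers who want a concrete picture of the “hard” versus “fuzzy” matching question above, the short Python sketch below illustrates the distinction. The registry contents, field names and similarity threshold are purely illustrative assumptions; they do not describe any particular provider’s verification system.

```python
# A minimal sketch of "hard" (automated exact) vs. "fuzzy" (similarity-based)
# credential matching. The registry, field names and threshold are illustrative
# assumptions, not any panel provider's actual verification system.
from difflib import SequenceMatcher

# Hypothetical official registry keyed by license number
REGISTRY = {
    "A123456": {"name": "Jane Smith", "specialty": "Cardiology"},
    "B987654": {"name": "Rafael Ortiz", "specialty": "Oncology"},
}

def hard_match(license_no: str, name: str) -> bool:
    """Automated match: license number and exact name must both agree."""
    record = REGISTRY.get(license_no)
    return record is not None and record["name"].lower() == name.lower()

def fuzzy_match(license_no: str, name: str, threshold: float = 0.85) -> bool:
    """Looser match on name similarity; borderline cases go to manual review."""
    record = REGISTRY.get(license_no)
    if record is None:
        return False
    similarity = SequenceMatcher(None, record["name"].lower(), name.lower()).ratio()
    return similarity >= threshold

print(hard_match("A123456", "Jane Smith"))   # True
print(fuzzy_match("A123456", "Jane  Smth"))  # True (similarity ~0.9 clears the 0.85 cutoff)
```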
Step 2: Monitor and manage physician survey-taking behavior.
Panel management and monitoring can also impact data quality. Like verification, it is the panel provider’s responsibility to adhere to these panel management principles, and it is the survey stakeholder’s responsibility to confirm that these principles are being followed. Keep in mind that though verified physicians are taking your survey, they may not always provide thoughtful responses to your questions. Since Internet surveys are self-administered, participants are in control. While most physicians take their survey participation seriously, there will always be a few respondents who seek to “expedite” their participation. Following are five things to insist on from a panel provider:
- Blacklisting of speeders/cheaters - those physicians who repeatedly demonstrate disengaged behavior while taking surveys.
- An established privacy policy certified by an outside agency (e.g., TRUSTe).
- Adherence to the CAN-SPAM Act.
- Whitelisting with major Internet service providers to ensure delivery of e-mail invitations.
- Systematic purging of non-responders.
The primary symptom of disengaged behavior is a physician who completes the survey significantly faster than his or her peers. In addition to speeding through a survey, other signs of a disengaged physician include straightlining rating-scale questions (e.g., rating everything a 4 on a seven-point scale), giving internally inconsistent/illogical answers and failing trap questions (e.g., “Doctor, if you are reading this question you will know to circle a 5 in the answer categories below”).
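As a rough illustration, the sketch below flags these symptoms in a completed data file using pandas. The column names, thresholds and the trap-question answer are hypothetical assumptions rather than fixed standards.

```python
# Minimal sketch of flagging the symptoms described above in a completed data file.
# Column names, thresholds and the trap-question answer are hypothetical assumptions.
import pandas as pd

RATING_COLS = ["q1", "q2", "q3", "q4", "q5"]  # a 7-point rating battery
TRAP_ANSWER = 5                               # "circle a 5" trap question

def flag_disengaged(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Speeding: completion time far below the median for the sample
    out["speeder"] = out["minutes_to_complete"] < 0.5 * out["minutes_to_complete"].median()
    # Straightlining: no variation at all across the rating battery
    out["straightliner"] = out[RATING_COLS].nunique(axis=1) == 1
    # Trap question: anything other than the instructed answer
    out["failed_trap"] = out["trap_q"] != TRAP_ANSWER
    out["disengaged"] = out[["speeder", "straightliner", "failed_trap"]].any(axis=1)
    return out

# Example usage with a file of survey responses:
# responses = pd.read_csv("responses.csv")
# flagged = flag_disengaged(responses)
# print(flagged["disengaged"].mean())  # share of respondents flagged
```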
While a handful of physicians speeding through a survey and providing less-than-thoughtful answers may seem insignificant, such behavior can cause significant skews in study data and misleading findings.
There are two ways to address this problem, each occurring at different times in the online survey process:
Pre-survey warnings. Make clear at the beginning of any physician survey that, in exchange for fair-market honoraria, respondents are expected to take the appropriate time to thoughtfully answer questions. It should also be pointed out that their survey-taking behavior will be monitored in real time and that, if suspicious behavior is detected, they will be warned to stop; if it continues, their participation will be terminated and they will not receive their honorarium.
Real-time monitoring of survey-taking behavior. Today’s survey platforms are sophisticated enough to allow tracking of response times on a question-by-question basis. To prevent physicians from completing surveys too quickly, “speed bumps” can be inserted into surveys. These speed bumps warn physicians that they are completing the survey faster than their peers and to slow down and give thoughtful answers to each question. Straightlining through answer grids or failing trap questions should trigger similar warnings or survey termination.
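The sketch below gives a sense of the kind of per-question timing check a platform might run mid-survey. The peer benchmarks, speed threshold and two-strike policy are illustrative assumptions, not a description of any vendor’s actual system.

```python
# A rough sketch of a real-time "speed bump" check on per-question response time.
# The peer benchmarks, threshold and two-strike policy are illustrative assumptions.
PEER_MEDIAN_SECONDS = {"q1": 20.0, "q2": 35.0, "q3": 15.0}  # from earlier completes
SPEED_RATIO = 0.3   # flag answers given in under 30% of the peer median
MAX_WARNINGS = 2    # warn first, terminate on repeat offenses

def check_response_time(question_id: str, seconds: float, warnings_so_far: int) -> str:
    benchmark = PEER_MEDIAN_SECONDS.get(question_id)
    if benchmark is None or seconds >= SPEED_RATIO * benchmark:
        return "ok"
    if warnings_so_far + 1 >= MAX_WARNINGS:
        return "terminate"   # end the survey without honorarium
    return "speed_bump"      # show the slow-down warning page

print(check_response_time("q2", 4.0, warnings_so_far=0))  # "speed_bump"
```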
While pre-survey warnings and real-time monitoring of survey-taking behavior may keep the well-intentioned respondent on track, these interventions can actually make matters worse with one particular respondent type: the respondent who has no intention of providing thoughtful responses but is only interested in collecting an honorarium - or worse, who has malicious intent to provide bogus answers. These people will alter their behavior so as not to be detected by any active monitoring, making it that much more difficult to find and remove them from a data set.
Impact the interpretability
At this point in the process, we have verified that we are interviewing real physicians and are monitoring their behavior during the survey to encourage them to “do the right thing” and take our survey seriously. But what happens if they don’t?
There are several areas where the influence of unmotivated respondents can impact the interpretability of research results. Two key areas include univariate statistical testing and multivariate statistical analysis.
In univariate statistical testing, the goal is to understand differences between items or respondent groups. For example, in pharmaceutical research we often compare specialties on their attitudes, interests and opinions about different products. When comparing physician specialties, unmotivated respondents who straightline blur the true differences between specialties and can produce a finding of “no statistical difference” when in fact a difference exists.
Another typical example involves physicians rating different products on the same set of attributes (e.g., efficacy, safety, etc.). Straightliners, as well as “tree trimmers” (respondents whose haphazard answers zigzag down an attribute battery in a pattern that resembles a tree), hide real differences in product perceptions, leading to a potential false conclusion that there are no statistical differences between the products on those attributes.
In multivariate analysis, the behavior of unmotivated respondents can lead to statistical problems in multivariate models as well as misleading conclusions from the analysis. Using the same example, a regression analysis predicting likelihood of product use based on product ratings on different attributes can be hampered by two factors. First, the correlation between product attributes is increased with straightliners and decreased with tree trimmers, which leads to a lack of statistical significance on the relationship between the product rating on the attributes and the likelihood of product use. Second, the relationship between the likelihood of product use and the rating on the product attributes is inflated by straightliners, leading to a faulty statistical relationship that may change the interpretation of the relative importance of each attribute in predicting the likelihood of product use. On the other hand, tree trimmers will reduce the strength of the relationship between particular product attributes and the likelihood of use, leading to a false conclusion of weak or no statistical significance. Both of these effects lead to biased conclusions in different directions.
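To make these mechanics concrete, the small simulation below (with an assumed data-generating process, not actual study data) adds a block of straightliners to a set of engaged respondents and compares the attribute inter-correlation and the estimated regression coefficients with and without them.

```python
# A small simulation (assumed data-generating process, not real study data) showing
# how straightliners inflate attribute inter-correlations and distort the estimated
# attribute weights in a regression predicting likelihood of use.
import numpy as np

rng = np.random.default_rng(0)
n_real, n_straight = 300, 60

# Engaged physicians: two independent attribute ratings drive likelihood of use
efficacy = rng.normal(5, 1, n_real)
safety = rng.normal(4, 1, n_real)
likelihood = 0.8 * efficacy + 0.2 * safety + rng.normal(0, 1, n_real)

# Straightliners: the same value for every item, including likelihood of use
flat = rng.integers(3, 6, n_straight).astype(float)
X_clean = np.column_stack([efficacy, safety])
y_clean = likelihood
X_all = np.vstack([X_clean, np.column_stack([flat, flat])])
y_all = np.concatenate([y_clean, flat])

def fit_ols(X, y):
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef  # [intercept, b_efficacy, b_safety]

print("attribute correlation, engaged only:     ", np.corrcoef(X_clean.T)[0, 1].round(2))
print("attribute correlation, with straightliners:", np.corrcoef(X_all.T)[0, 1].round(2))
print("coefficients, engaged only:      ", fit_ols(X_clean, y_clean).round(2))
print("coefficients, with straightliners:", fit_ols(X_all, y_all).round(2))
```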
At this point we hope you are asking, “OK, so how do we fix this problem?”
Step 3: Analyze respondent responses and validate data sets.
Reputable marketing research firms have a process for checking the validity of data files prior to beginning any analysis. In the not-too-distant past this was a laborious, subjective undertaking. Today, however, most commercially available data-analytic software packages include routines that make data cleaning fast, efficient and objective. These routines uncover outliers in a data set (based on criteria set by the end user), which can then be removed from the analysis. Most often this is done at the response-item or question level rather than at the respondent level. For example, on a question asking about monthly patient volume, an answer of three, when all other answers cluster around 300, would be identified as a clear outlier and dropped for that specific question.
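As a minimal sketch of this kind of question-level screen, the routine below applies a common interquartile-range rule to the patient-volume example; the 1.5×IQR cutoff is one widely used convention, not the implementation of any specific package.

```python
# Question-level outlier screen using a common interquartile-range (IQR) rule.
# The 1.5*IQR cutoff is a convention, not the routine of any specific package.
import pandas as pd

def flag_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

monthly_volume = pd.Series([310, 295, 305, 320, 3, 298, 312])
print(flag_outliers(monthly_volume))  # only the answer of 3 is flagged
```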
For questions that have a wide range of potential responses, this is a fairly reliable way of identifying outlying responses. However, there are limitations to data-cleaning approaches that rely heavily on identification of response outliers:
- Such an approach is less sensitive to questions that have a more limited range of potential responses, such as a rating scale.
- The validation is completed at the question level, not the respondent level.

The key is to be able to identify a respondent who is not engaged, who provides responses that are not actually representative of his or her opinions and behaviors. It is entirely possible that many of this type of respondent’s answers would fall within an acceptable range and thus go undetected by standard data-cleaning routines. But if these respondents remain in a data set, any conclusions drawn from the research can be inaccurate or downright misleading (see sidebar).

Obviously, this type of respondent should be removed from any data set before beginning analysis. The real question is how to accomplish this in an objective, responsible fashion.
Removing a respondent from a data set should only be done as a last resort, and with clear, objective information as to why they were removed. Respondents should not be removed simply because their answers “look funny” or are counterintuitive. And without an objective standard, there are ethical considerations that weigh on the decision to remove a respondent. Doing so could very well change the conclusions drawn from a study.
The approach that we take in evaluating survey respondents proceeds in three phases:
- Questionnaire design
- Unusual response detection and respondent scoring
- Review by a domain expert
Questionnaire design
The questionnaire is designed with domain-specific checks for logical consistency. For example, the same domain-specific question is asked in different ways in an attribute battery so that a respondent who has a logically consistent opinion would answer one question positively and the other negatively. These types of domain-specific checks are built into a questionnaire to identify logically inconsistent data that could greatly skew results.
Unusual response detection and respondent scoring
The domain-specific questions are coded into a respondent scoring program which creates a score for each respondent based on the number of inconsistencies for that respondent. A report is generated that identifies respondents who could potentially skew research conclusions with their responses.
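A scoring routine of this kind might look like the sketch below. The column names, the reversed-item pair and the review threshold are hypothetical; the point is simply that each inconsistency adds to a respondent-level score that feeds the phase-three review.

```python
# Sketch of phase two: scoring each respondent on domain-specific consistency
# checks built into the questionnaire. Column names, the reversed-item pair and
# the review threshold are hypothetical assumptions.
import pandas as pd

def inconsistency_score(row: pd.Series) -> int:
    score = 0
    # Reversed-item pair: a consistent respondent should not rate both highly
    if row["drug_is_safe"] >= 6 and row["drug_worries_me"] >= 6:  # 7-point scales
        score += 1
    # Stated behavior vs. reported volume: heavy use with no eligible patients
    if row["would_prescribe_weekly"] and row["eligible_patients_per_month"] == 0:
        score += 1
    # Trap question missed
    if row["trap_q"] != 5:
        score += 1
    return score

# responses = pd.read_csv("responses.csv")
# responses["inconsistencies"] = responses.apply(inconsistency_score, axis=1)
# for_review = responses[responses["inconsistencies"] >= 2]  # phase-three review list
```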
Review by a domain expert
The results from phase two are used to identify respondents who warrant further review. The report builds a case for further investigation by domain experts who review each respondent’s data individually to determine if the pattern of responses builds a logically consistent story for the research. It could be that the unusual response pattern provides some critical insight that offers new information that should be considered in the conclusions of the research. Without a critical review by domain experts, this type of respondent might be removed automatically by a software program.
Chain of responsibility
In marketing research, as with most things in life, there is an easy way to do things and then there is the right way to do things. Validating survey respondents, monitoring real-time responses and systematically evaluating those responses all fall into the category of “the right way.” However, marketing research clients cannot assume that all of these steps are being taken by vendors simply because it’s the right thing to do.
There is a chain of responsibility as it relates to data integrity and all parties involved in that chain share responsibility for ensuring that marketing decisions are being made based on valid information: The client has a responsibility to inquire about what steps are being taken; the vendor has a responsibility to proactively initiate processes to ensure greater validity; and the respondent has a responsibility to be engaged and provide thoughtful responses. The impact of any marketing research project is only as strong as its weakest link.
The takeaway changed
A study was commissioned by a client who was introducing a novel therapeutic agent that required infusion. A key marketing issue was whether MDs perceived infusion therapies to be inherently more risky than traditional oral therapies in this class. Our goal was to understand how perceptions of infused therapies as risky would affect intent to use this new product.
Our validation process was applied to the data and identified X percent of the sample as “unengaged respondents.” A regression analysis was performed on both the complete data set (including the unengaged) and the validated data. The analysis of the cleaned data produced a better regression model, with an increase in model fit (a 5 percent increase in r-squared after X percent of the sample was removed).
Also, the interpretation of the results changed. The conclusion from the uncleaned data showed that a physician who perceived no more risk associated with infusion therapy was more likely to use the client’s product. Once the unengaged respondents were removed from the data, the conclusion was actually reversed: physicians who perceived more risk with infusion therapy were less likely to use the product. A key management takeaway completely changed when unmotivated respondents were included in the analysis.
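For readers who want to run this kind of before-and-after comparison on their own studies, a minimal sketch is shown below. The column names and the “unengaged” flag are illustrative assumptions; the sidebar’s actual figures are not reproduced here.

```python
# Sketch of the before/after comparison described in the sidebar: fit the same
# regression on the full and the cleaned data and compare R-squared. Column names
# and the "unengaged" flag are illustrative assumptions.
import numpy as np
import pandas as pd

def r_squared(df: pd.DataFrame, predictors: list, outcome: str) -> float:
    X = np.column_stack([np.ones(len(df))] + [df[p].to_numpy() for p in predictors])
    y = df[outcome].to_numpy()
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - resid.var() / y.var()

# data = pd.read_csv("infusion_study.csv")  # includes an "unengaged" flag from Step 3
# preds = ["perceived_infusion_risk", "efficacy_rating"]
# print("full sample R^2:   ", r_squared(data, preds, "intent_to_use"))
# print("cleaned sample R^2:", r_squared(data[~data["unengaged"]], preds, "intent_to_use"))
```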