Editor’s note: Kevin O’Donnell is managing director of San Francisco Research Services, LLC, San Francisco. Peter Brownstein is a senior account executive with Survey Sampling, Fairfield, Conn.
The best marketing researchers we know are also some of the most pragmatic practitioners of their craft. That’s no coincidence. They understand which rules are more flexible than others, and under which circumstances.
For instance, take the case of sampling low-incidence populations. Until recently, most marketing researchers had adhered to the law of randomness (mostly random-digit dialing), remaining safely within the defensible - albeit expensive - method. As a result, researchers (and their clients) paid hefty field bills for screening through thousands of randomly generated numbers to find qualified respondents.
But as long as the client had enough budget and time (and the researcher designed the right screener), one could be highly confident of finding the correct target population. Completing an RDD study with an incidence of 5 percent or less was something of a hard-earned - if bruising - badge of honor. Though marketing researchers are near-masochistic (almost by definition), low-incidence studies are almost always inefficient.
(A distinction here is important: Market research often demands RDD sampling for objectives like broad census-taking [e.g., "adoption" and "penetration" of new technology]. Marketing research, however, most often is directed toward highly actionable marketing solutions, such as changing consumer attitudes and behavior. The latter often studies very specific populations that are narrowly defined by behaviors, attitudes, inclinations, demographics, and so on.)
Real-world solutions
Targeted samples of low-incidence populations have made much of that pain unnecessary. Broad databases of consumers with recorded behaviors, interests, demographics, etc., can yield highly productive marketing research results. A real-world comparison of two similar studies of consumers of premium wines is illustrative.
| | March 1999 (Study "B") | January 1998 (Study "A") |
|---|---|---|
| Completes | 600 | 250 |
| Selected screens | | |
| Age screen | 4% | 7% |
| Not wine/alcohol drinker | 3% | 20% |
| Final incidence | 18% | 8% |
- Study "A" was conducted in January 1998, with an RDD sample.
- Study "B" was conducted in March 1999, with a sample of records randomly selected from a database of listed households with a self-reported interest in wine.
Accuracy and efficiency: pragmatic equals
A critical difference between the relative efficiencies of these samples is found in the proportion of consumers in the category "wine and alcohol users." The RDD sample in 1998 screened out one in five (20 percent) contacts who did not consume wine or alcohol - more than six times the one in thirty-three (3 percent) screened out in 1999.
Just as important in a practical sense, the 1999 study yielded 0.64 completed interviews per hour compared with 0.46 in 1998 - a 39 percent increase in productivity. From a time- and cost-efficiency standpoint, the improvements are significant. Yet savings of time and money are only indicative of how well the sample accomplishes its functional objective.
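The comparison above comes down to a few lines of arithmetic. The following is a minimal sketch in Python, not something taken from the studies themselves: the function names are invented for illustration, and the 600-complete target used for the hours comparison is an assumption chosen only to show the scale of the difference.

```python
# Figures quoted in the article for the two wine studies.
RDD_COMPLETES_PER_HOUR = 0.46      # Study "A", January 1998 (RDD sample)
LISTED_COMPLETES_PER_HOUR = 0.64   # Study "B", March 1999 (listed database sample)
RDD_NONDRINKER_SCREENOUT = 0.20    # share of contacts screened out as non-drinkers
LISTED_NONDRINKER_SCREENOUT = 0.03


def percent_increase(new: float, old: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


def hours_needed(completes: int, completes_per_hour: float) -> float:
    """Interviewing hours required to reach a target number of completes."""
    return completes / completes_per_hour


productivity_gain = percent_increase(LISTED_COMPLETES_PER_HOUR, RDD_COMPLETES_PER_HOUR)
screenout_ratio = RDD_NONDRINKER_SCREENOUT / LISTED_NONDRINKER_SCREENOUT

# Hypothetical target: assume both methods had to deliver 600 completes
# and that each study's hourly rate held throughout the fieldwork.
target_completes = 600
rdd_hours = hours_needed(target_completes, RDD_COMPLETES_PER_HOUR)
listed_hours = hours_needed(target_completes, LISTED_COMPLETES_PER_HOUR)

print(f"Productivity gain: {productivity_gain:.1f} percent")
print(f"Non-drinker screen-out ratio (RDD vs. listed): {screenout_ratio:.1f}")
print(f"Hours for {target_completes} completes: RDD {rdd_hours:.1f}, listed {listed_hours:.1f}")
```

Under that assumption, the RDD approach would need roughly 1,300 interviewing hours against roughly 940 for the listed sample. The two studies differed in timing and sample size, so the figures are an order-of-magnitude guide rather than a controlled comparison.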
Keep in mind that any sample drawn from a self-reported database is subject to sampling frame error, and there is no way to weight or adjust responses to compensate for such an error. In most databases of this nature, certain elements of the real population are systematically excluded because they did not respond to the questionnaire or were not given the opportunity in the first place. The important consideration here is that this study did not attempt to make inferences about the entire universe of wine drinkers. Its objective was to provide marketing direction and a course of action, and that’s what it accomplished.
Determinant considerations
Two questions weigh on the decision to use listed database sampling methods:
1. Does the database contain elements that most nearly match the definition of "population under study"? For example, the interest categories of cultural events/classical music merged with household income above $50,000 can approximate the market of high propensity symphony attendees.
2. Is the database large and recent enough? Fast-growing markets like on-line subscribers may not be represented adequately in the database. As that market segment grows rapidly, it also changes demographically quarter-to-quarter. A listed sample of on-line subscribers more than one year old is hopelessly unrepresentative.
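The two questions above can be read as a filter on the database followed by a random draw. The sketch below is only an illustration of that idea in Python: the `Record` fields, the category label, the one-year cutoff as a default and the toy records are all hypothetical, and a real vendor database will have its own layout and selection tools.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """One listed household (hypothetical layout, for illustration only)."""
    household_id: int
    interests: frozenset
    household_income: int
    reported_on: date  # when the household supplied its answers


def select_frame(records, interest, min_income, as_of, max_age_days=365):
    """Keep records that match the population definition and are recent enough."""
    return [
        r for r in records
        if interest in r.interests
        and r.household_income > min_income
        and (as_of - r.reported_on).days <= max_age_days
    ]


def draw_sample(frame, n, seed=0):
    """Randomly select up to n records from the qualified frame."""
    rng = random.Random(seed)
    return rng.sample(frame, min(n, len(frame)))


# Toy records, invented purely to exercise the filter.
CLASSICAL = "cultural events/classical music"
records = [
    Record(1, frozenset({CLASSICAL}), 72_000, date(1999, 1, 15)),
    Record(2, frozenset({CLASSICAL}), 41_000, date(1999, 2, 1)),          # income too low
    Record(3, frozenset({"gardening"}), 90_000, date(1999, 2, 20)),       # wrong interest
    Record(4, frozenset({CLASSICAL, "wine"}), 65_000, date(1997, 6, 1)),  # stale record
    Record(5, frozenset({CLASSICAL}), 58_000, date(1998, 11, 3)),
]

frame = select_frame(records, CLASSICAL, min_income=50_000, as_of=date(1999, 3, 1))
sample = draw_sample(frame, n=1)
print([r.household_id for r in frame])
```

The random draw happens only after the frame is filtered, so the sample is random within the listed population that meets the definition, not within the market at large.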
Samples from listed databases of consumer behavior and preferences relieve much of the pressure of screening and qualifying vast numbers of randomly selected households, but they do not eliminate the need altogether. Though these samples draw from self-reported sources, common sense and the production numbers above underscore the importance of careful screening. The listed database and careful screening combine to yield more meaningful results for practical researchers who are at the headwaters of the marketing decision-making process.
Database sources
Samples of this sort are usually created from databases compiled by questionnaires delivered by direct mail, magazine insert and other vehicles. In many cases different databases are being developed from different sets of questionnaires that cover the same or similar selections. A skilled sampling vendor understands the subtleties of question language, distribution, response and other items that distinguish the databases so they can offer the most appropriate solution. They also control the freshness of demographic data and maintain the accuracy of the phone and/or address information. The best sources work with multiple databases to create the most representative samples possible, while still within the world of highly targeted samples drawn from compiled databases. These samples offer significant practical advantages in the field.
Cost and time savings alone are not adequate justifications to abandon RDD methods. The fact is, marketing researchers are most often interested in narrowly defined populations and in those who resemble them in some measure. The concept is self-evident: Investors in mutual funds are highly likely to invest in other securities. A random selection from a large database of mutual fund investors is a more relevant and useful representation of that market.
Selecting a sample source
There are only three criteria for selecting the source of a listed database sample:
1. Credibility and integrity of the firm and its methods.
- How frequently do they update files?
- Are they thorough and vigilant in seeking new databases?
- Can they merge databases efficiently?
2. Credibility and integrity of the individual pulling the file.
3. Credibility and integrity of the original source of the database.
- Caveat emptor: be thorough and ask lots of questions (the tone and consistency of responses will also help in answering #1 and #2).
As always, feel free to discuss these issues with your colleagues. AAPOR.net (a service to the members of the American Association for Public Opinion Research), for example, is a collegial roundtable with many experienced and helpful individuals. Ask for their opinions and experiences. You’ll get them. Advice can also be obtained by posting questions on research bulletin boards at Web sites such as www.quirks.com or www.worldopinion.com.
Cost is not usually a factor. Listed database files are often twice as expensive as RDD samples, but those costs are negligible compared to the savings in the field. Consider that 15,000 RDD numbers cost approximately $1,500, compared to 10,000 records of listed database sample at a cost of $2,500. Further consider the efficiency of completion indicated above, and you have some idea of the relative cost consequences of each sampling method. The most significant difference stems from the fact that you need to make far fewer phone calls.
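To see why the higher sample price tends to wash out, it helps to ask how expensive an hour of field time must be before the listed sample pays for itself. The sketch below is a rough Python illustration under loudly stated assumptions: it borrows the completion rates from the two wine studies, assumes each sample purchase is large enough to support the same number of completes, and uses a hypothetical 600-complete target. The function names are invented for this example.

```python
# Sample prices quoted in the article.
RDD_SAMPLE_COST = 1_500.0      # dollars for 15,000 RDD numbers
RDD_NUMBERS = 15_000
LISTED_SAMPLE_COST = 2_500.0   # dollars for 10,000 listed records
LISTED_RECORDS = 10_000

# Completion rates from the wine studies (assumed to carry over here).
RDD_COMPLETES_PER_HOUR = 0.46
LISTED_COMPLETES_PER_HOUR = 0.64


def cost_per_record(total_cost: float, records: int) -> float:
    """Price of a single sample record."""
    return total_cost / records


def breakeven_hourly_field_cost(extra_sample_cost: float, completes: int,
                                slow_rate: float, fast_rate: float) -> float:
    """Hourly field cost at which the hours saved repay the extra sample cost."""
    hours_saved = completes / slow_rate - completes / fast_rate
    return extra_sample_cost / hours_saved


rdd_unit = cost_per_record(RDD_SAMPLE_COST, RDD_NUMBERS)
listed_unit = cost_per_record(LISTED_SAMPLE_COST, LISTED_RECORDS)

# Hypothetical target of 600 completes for both methods.
breakeven = breakeven_hourly_field_cost(
    LISTED_SAMPLE_COST - RDD_SAMPLE_COST,
    completes=600,
    slow_rate=RDD_COMPLETES_PER_HOUR,
    fast_rate=LISTED_COMPLETES_PER_HOUR,
)

print(f"Cost per record: RDD ${rdd_unit:.2f}, listed ${listed_unit:.2f}")
print(f"Break-even field cost: ${breakeven:.2f} per interviewing hour")
```

Under these assumptions the break-even point is just under three dollars per interviewing hour, so any realistic field rate favors the listed sample. A different completes target or different completion rates would move that figure.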
All marketing researchers should think critically about the appropriateness of methods for sampling each and every study. Sometimes a highly targeted sample will be in the best interest of the project. Sometimes not. You simply have to address the task at hand and select the best method, all things considered, to suit your needs.