Editor’s note: Robert L. Zimmermann is senior research manager for design and analysis at Maritz Marketing Research, Inc., Minneapolis division, a company he has been with for four years. He is currently a clinical assistant professor of psychiatry at the University of Minnesota, where he is a statistical consultant to grants in the areas of addiction and eating disorders. Zimmermann has taught at the University of Winnipeg and held research positions at George Washington University and the University of Minnesota. He holds an M.A. and Ph.D. in psychology from the University of Minnesota and has published over 60 articles in psychiatric, educational and marketing research.
This article focuses on some of the problems inherent in using a test market to assess the impact of some change in marketing. By test market I refer to a procedure whereby some subset of the real market is altered to obtain an understanding of the impact on volume, market share or profit. In the case of a product so new and unique that a market cannot be said to exist, a test market would involve establishing such a market on a limited basis.
I am contrasting the use of test markets with a range of procedures which I call analytical research tools. These include perceptual mapping, conjoint analysis, segmentation studies, concept tests and in-home trials. The methods tend to be more complex in research design and involve more complex data analysis procedures. The defining characteristic as the terms are used here, however, is that analytical research tools use hypothetical models to gain insight into real market forces whereas test markets take place within a slice of the real market. Many, possibly all, substantive issues can be addressed with both approaches, e.g., price sensitivity can be studied using conjoint analysis or real market manipulations.
Quality, quantity
There are two kinds of issues involved in assessing the usefulness of test markets: issues involving quality and quantity of the information obtained. The primary issue with regard to quality is predictive ability: Will the results obtained from the research study generalize to the market as a whole? Quantity is an issue of efficiency.
Test markets possess what a measurement expert would call content validity. This means they consist of a sample or subset of the actual phenomena to be assessed. It needs to be stressed that content validity does not guarantee predictive validity.
Predictive validity requires an adequate sampling as well as control or evaluation of the effects of all non-random variables which might affect the criterion measure.
Corollary examples
The following are some corollary examples from other areas of applied science to illustrate the difference between predictive validity and content validity.
Suppose one is confronted with the training of a large number of people in skills for which they have had little or no relevant experience, for example, the U.S. armed services at the beginning of the first and second world wars. The most certain method of determining if people would succeed in a specialized training program is simply to put them in the program. But this method is very costly. Many would fail the program, yet for those who succeed, one does not know if this is their optimally successful area. Two or three days of initial testing markedly reduces the risk of faulty placement. Few would argue that a test is either more real or more valid at predicting future success than an actual trial in the training program. It is simply much more efficient.
In medicine, certain bacterial infections are differentially sensitive to different antibiotics. One way of determining the appropriate antibiotic is to try each in succession in the patient. This might take one or two weeks per antibiotic, but if the patient lived long enough and if there were no other complicating factors to cloud one's evaluation, one would eventually determine the appropriate medication. Alternatively, one can grow a culture of the bacteria in an artificial medium and test all the potential medications within a few days. The question is not which is more real but which is more efficient.
Possible problems
There are at least four areas in which problems may arise in implementing a test market: market selection, test implementation, uncontrollable extraneous factors and test assessment. The selection of a market or markets in which to make the test should aim at putting together a microcosm that matches the larger market. The two should be functionally equivalent on all variables that might affect the criterion measure. Note that equivalence includes not only geodemographics but temporal cycling, marketing patterns and such adventitious factors as droughts, severe winters, factory closings and anything else that might affect response to the test market.
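The matching requirement can be made concrete by scoring each candidate market on how far it sits from the overall market across the variables thought to matter. The sketch below is illustrative only: the market names, the three variables and every figure are invented, and the equal-weight distance on standardized variables is an assumption, not a procedure described in this article.

```python
from math import sqrt
from statistics import pstdev

# Hypothetical profiles: (median income in $000, % households with children,
# seasonal sales index). All figures are invented for illustration.
candidates = {
    "Market A": (42.0, 31.0, 1.10),
    "Market B": (55.0, 24.0, 1.45),
    "Market C": (44.5, 33.0, 1.05),
    "Market D": (38.0, 40.0, 0.80),
}
national = (45.0, 32.0, 1.00)  # hypothetical profile of the larger market


def mismatch_scores(candidates, target):
    """Distance of each candidate from the target on standardized variables."""
    columns = list(zip(*candidates.values()))
    # Scale each variable by its spread across candidates so that no single
    # variable dominates merely because of its units.
    spreads = [pstdev(col) or 1.0 for col in columns]
    scores = {}
    for name, profile in candidates.items():
        z = [(v - t) / s for v, t, s in zip(profile, target, spreads)]
        scores[name] = sqrt(sum(d * d for d in z))
    return scores


scores = mismatch_scores(candidates, national)
ranked = sorted(scores, key=scores.get)  # best match first
for name in ranked:
    print(f"{name}: {scores[name]:.2f}")
```

A low score says only that a market resembles the larger market on the variables that were measured. It says nothing about temporal cycling, local marketing patterns or adventitious events unless those are entered as variables too.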
Unfortunately, considerations of statistical adequacy are often forced to give way to other considerations. Markets have natural boundaries defined by distribution channels and other factors. In addition, it may be easier to implement changes in some markets than in others.
More mischievous is when the market choice is based in part on considerations which might clearly bias the data. This happens most frequently in new product introductions where there is the strong tendency to introduce a product in the area(s) where it is expected to have the best chance of doing well. At worst, a test market can become a self-fulfilling prophecy designed to bolster the position of some faction within a company.
Multiple conditions
The problem of market selection is compounded if multiple conditions are being evaluated because then the separate markets must be matched on all factors which might affect response to the test changes. These may be unknown or difficult to evaluate on a market-by-market basis. It is virtually impossible to statistically equate markets on an ex post facto basis without data on individual consumers.
Next is the problem of implementation. At the most basic level, market researchers may have only indirect control over price and other factors influencing the sale of their product. Wholesalers and retailers may not be wholly responsive to the changes. There may be parochial price competitions, volume discounts, differential use of coupons and competition for shelf space (or simply inefficient or inaccurate stocking of shelves).
Distribution patterns
There may be several distribution patterns at the wholesale and retail level with the boundaries overlapping and only vaguely defined. These affect the geographical and temporal precision associated with the placement of a test market as well as the data that are used to evaluate the test market. Thus, it is often difficult to describe precisely the limits or nature of the market being tested, much less guarantee its representativeness. If market researchers have limited control over their own price, they have virtually no control over the policies of competitors and, in fact, may not even have a clear idea of what those policies are. Competitors may or may not keep their pricing structure or other marketing policies constant across the span of a test market, and any changes may be different for each competitor and for each market.
In some kinds of markets, e.g., restaurant chains, the major competition may not only be unknown but almost unknowable, varying by type of occasion, dependent on price and location and frequently consisting of a changing group of privately owned establishments each impacting only one or two franchises.
Data often inadequate
Even assuming it is possible to maintain control adequate to the question at issue in a test market, the data produced by a test market are often inadequate. Overlapping distribution patterns, variations in the amount of product stored at different levels of the distribution system, variations in the speed with which goods move through the distribution system and variability in the efficacy of exerting control over the test factors produce a lack of precision in the data accumulated in a test market.
Even where something approaching an audit trail can be maintained or where data are available at the individual consumer level for a reasonably intact target market, there remains a mushiness to the data. This is because in most real market contexts you cannot directly impinge on an individual consumer in a manner adequate to permit documentation of the timing and degree of awareness of the test changes. (Granted, when you are testing promotions, the evaluation of this factor may be precisely the objective of the test market.)
Inefficient way
Even given that the data obtained from test markets have an inherent validity to them and that the necessary controls can be invoked to assure that data are unbiased, test markets still remain a very inefficient way to do market research.
Test markets are very expensive and cumbersome. It usually takes a long time and a lot of money to execute them and even so it is usually only practical to test a very limited set of parameters. It is absolutely impractical to test all the relevant parameters within a test market context. Therefore, before a test market is undertaken, it is essential that preliminary analytic research be performed to determine the optimal parameters to be tested.
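The arithmetic behind this limit is simple. As a rough illustration, with factor names and levels invented for the example, a design that crosses every level of every factor needs at least one matched test market per combination:

```python
from itertools import product

# Hypothetical marketing factors and the levels one might want to compare.
factors = {
    "price": ["$1.99", "$2.29", "$2.49"],
    "package": ["current", "redesigned"],
    "promotion": ["none", "coupon", "in-store display"],
    "ad_weight": ["normal", "heavy"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


cells = full_factorial(factors)
print(len(cells))  # 3 * 2 * 3 * 2 = 36 combinations
```

Four modest factors already call for 36 separately matched markets, and each added factor multiplies the total. Preliminary analytic research serves to cut this list down to the few combinations worth the cost of a real-market test.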
Test markets are also by definition intrusive procedures. Their costs are not always limited to the costs of implementing and evaluating the test effects. An inappropriate manipulation may have an extended effect on at least the test market.
Analytic research tools are unquestionably more efficient both in terms of cost and turnaround time. A much wider array of information is obtainable, usually quickly and cheaply enough to permit coming back with additional questions raised by the research itself. Some of these procedures permit simulation, extending the applicability of the results beyond the constraints of the test conditions. This produces a richness of data not obtainable in a test market context.
The drawback
The drawback of analytical research tools, of course, is that most of these procedures contain an unspecified amount of systematic bias and error of measurement. This is especially worrisome with respect to parametric estimates of market share and/or sales volume.
Note specifically, however, that I do not list as one of the drawbacks that analytical studies are somehow less real than test markets. It is certainly true that they are all less realistic, but that should be an irrelevant issue. The real issue should be what is the most cost-effective and reliable predictor of market behavior. It is only when we cannot provide any convincing basis for answering this latter question in favor of analytic procedures that we are forced to fall back by default on test markets. Conversely, it is very difficult to establish the predictive validity of test markets because they are both so cumbersome and so unique that it is difficult to build up a statistically reliable base of predictive test markets. Established, homogeneous, steady-state market contexts may be exceptions to this.
Amount and accuracy
In summary, the two questions at issue in comparing test marketing and analytical procedures as methods for answering questions or gaining insight regarding marketing directions are those of efficiency and predictive validity, that is, the amount of information and the accuracy of the information. I am convinced that analytical procedures are by far the more efficient at gaining knowledge of the way the market will perform under a variety of potential manipulations.
Predictive ability is, on the other hand, a moot issue. There is not anything inherent in a test market that guarantees it to have superior predictive ability, except that as the test size is increased in both geographic and temporal extent, the question eventually becomes irrelevant. Nevertheless, test markets will probably remain the court of last resort before full implementation of a critical marketing change.
Discrete parameters
Since a test market can test only a specific set of issues, it should be undertaken only when the options have been reduced to very discrete parameters. This can often be done most efficiently through the proper use of analytic procedures. Finally, when a test market is conducted, resist attributing magical properties to the results. If the information obtained from a test market disagrees markedly with analytic research studies, question both methodologies until you can reconcile the differences. There are as many factors which can impinge on the validity of a test market as can invalidate an analytical research test.