Pre-post vs. post-only studies
Editor’s note: Don Bruzzone is president of Bruzzone Research Company, Alameda, Calif.
When you want to measure the effect advertising has had, is it better to conduct before-and-after surveys, or can a single survey conducted after the advertising do the job? That is not an easy call. There are arguments on both sides.
There has long been a pervasive feeling among researchers that to show whether advertising changed things, you have to know how those things were before the advertising appeared. That applies to virtually every measure you might use: awareness of the product, perceptions of the product, interest in buying the product, etc. But there is one exception: “lift.” Lift is the amount by which a measure is higher among those who noticed the advertising than among those who did not. It can be the lift in awareness, or the lift in buying interest, that is associated with noticing the advertising.
Use of lift is growing
The growing use of recognition-based tracking has made the measurement of lift more feasible. Recognition is the most accurate way of splitting a sample into those who noticed the advertising and those who didn’t.
When telephone surveys are used for ad tracking, the typical question is, “Do you recall seeing any advertising for Brand X in the past month?” When respondents say yes, they may be thinking of advertising Brand X ran six months or a year ago. Or they might even be thinking of a competitor’s advertising. But when you show the advertising to them and ask, “Do you remember seeing THIS before?” you get a massive increase in accuracy. Then you can accurately split the sample into those who noticed the advertising and those who either ignored it or never had a chance to see it. That is the key to determining if there is anything significantly different about those who were reached by the advertising. And that, in turn, is the key to measuring lift.
Two measures of lift are the most common. They measure increased awareness and increased buying interest among those reached by the advertising. For the first, you typically ask respondents which name comes to mind first when they think of the products in that category. For the second, respondents can be asked to assume they are in the market for the product today. How likely would they be to buy each of a number of products in that category? Both need to be asked before any advertising is shown for that category.
Both measures are essential. They show if overall improvements in awareness and buying interest were related to noticing the advertising. They provide solid evidence of a link between the improvement and the advertising.
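As a rough illustration of the arithmetic behind these two measures, the sketch below splits a sample by recognition and compares the two halves. The field names, the yes/no coding of buying interest, the eight made-up respondents and the choice of a two-proportion z-test are all assumptions made for the example, not a description of any particular tracking system.

```python
from dataclasses import dataclass
from math import erfc, sqrt


@dataclass
class Respondent:
    # All three fields are hypothetical codings chosen for this sketch.
    recognized_ad: bool      # said yes when shown the ad
    named_brand_first: bool  # unaided first-mention awareness
    likely_to_buy: bool      # buying interest, reduced to yes/no


def share(flags):
    """Proportion of True values in a list of booleans."""
    return sum(flags) / len(flags) if flags else float("nan")


def lift(respondents, measure):
    """Difference in a measure between ad recognizers and non-recognizers."""
    noticed = [getattr(r, measure) for r in respondents if r.recognized_ad]
    missed = [getattr(r, measure) for r in respondents if not r.recognized_ad]
    return share(noticed) - share(missed)


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for a difference between two proportions
    (normal approximation; unreliable for samples as small as the demo)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (x1 / n1 - x2 / n2) / se
    return erfc(abs(z) / sqrt(2))


# Made-up respondents, purely to exercise the functions.
sample = [
    Respondent(True, True, True),
    Respondent(True, True, False),
    Respondent(True, False, True),
    Respondent(True, False, False),
    Respondent(False, True, False),
    Respondent(False, False, False),
    Respondent(False, False, True),
    Respondent(False, False, False),
]

awareness_lift = lift(sample, "named_brand_first")
buying_lift = lift(sample, "likely_to_buy")
# 2 of 4 recognizers vs. 1 of 4 non-recognizers named the brand first.
p_value = two_proportion_p_value(2, 4, 1, 4)
```

With a real sample the same comparison would be run on hundreds of respondents per group, which is what makes the significance test meaningful.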
They also have a classic limitation: Many researchers feel those who already think highly of a product or brand are more likely to notice its advertising. So higher levels of awareness and buying interest can be found among those who recognize advertising - without the advertising having caused any real change among them.
There are two basic ways of controlling for this: the pre-post approach where two surveys are conducted, and the post-only approach where a careful evaluation of more circumstantial evidence is needed.
The pre-post approach
Here, the same people are interviewed before and after the advertising. This shows how many individuals did not have top-of-mind awareness before the advertising, but did after noticing the advertising. Similarly, it would show how many were not interested in buying before, but were after. In short, this approach shows how often actual changes took place in an individual’s awareness and buying interest.
But, like experiments in any field, you need to guard against the research itself having some inadvertent effect on the results. In pre-post ad tracking the most common effect is that the first interview sensitizes some people and makes them more likely to notice subsequent advertising. In the post-advertising wave they are more likely to recognize the advertising, and if the advertising is effective, the number affected by the advertising would be higher than among the public at large.
The solution is the same as in most experimental research: the use of both test and control groups. Those interviewed both before and after the advertising are the test group, and a matched group, interviewed only once after the advertising, is the control group. The control group shows if the number reached and affected was overstated by the results from the reinterviewed test group. It also provides the most comparable base for measuring any overall changes in the marketplace since the advertising started. But again, the test group is the group where we learn if individuals actually change after being reached by advertising.
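The two roles of the test and control groups can be sketched in a few lines. This is only an illustration under assumed names: each respondent is a dictionary with hypothetical keys "pre", "post" and "recognized", and the data are invented.

```python
def conversion_rate(test_panel):
    """Share of ad recognizers in the reinterviewed panel who lacked the
    measure (e.g., top-of-mind awareness) before and had it after."""
    recognizers = [p for p in test_panel if p["recognized"]]
    if not recognizers:
        return float("nan")
    gained = [p for p in recognizers if not p["pre"] and p["post"]]
    return len(gained) / len(recognizers)


def recognition_rate(group):
    """Share of a group that recognized the advertising."""
    return sum(p["recognized"] for p in group) / len(group)


def sensitization_gap(test_panel, control_group):
    """How much higher recognition runs in the reinterviewed test group
    than in the matched, post-only control group."""
    return recognition_rate(test_panel) - recognition_rate(control_group)


# Invented data: the test panel has pre and post answers, the control
# group was interviewed once, after the advertising.
test_panel = [
    {"pre": False, "post": True, "recognized": True},
    {"pre": False, "post": False, "recognized": True},
    {"pre": True, "post": True, "recognized": True},
    {"pre": False, "post": False, "recognized": False},
    {"pre": True, "post": True, "recognized": True},
]
control_group = [
    {"recognized": True},
    {"recognized": True},
    {"recognized": True},
    {"recognized": False},
    {"recognized": False},
]

converted = conversion_rate(test_panel)
gap = sensitization_gap(test_panel, control_group)
```

A positive gap would suggest the first interview made the test group more attentive, so the panel’s reach figures would be read with that overstatement in mind.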
The pre-post approach is the most expensive, but it provides the most conclusive results.
The post-only approach
Even though pre-post tests are best, a great deal can still be learned about the effectiveness of advertising from a single wave of interviewing after the advertising appeared. With today’s pressure to come up with usable answers at the lowest possible cost, the question is: Can the additional cost of before-and-after surveys be justified?
Our firm has accumulated several types of key evidence during the 28 years we have been conducting recognition-based ad tracking. The evidence shows the number already favorably predisposed is not the main driver of recognition or any other measure of the number reached and affected. This means, to a large extent, that the actual differences in the impact of advertising can also be seen in post-only surveys. The winners and losers are likely to be the same as in before-and-after surveys.
First, for the sake of argument, say the number that like the brand does have a strong effect on the number that will notice its commercial. Then, we would expect to find a strong relationship between market share and recognition. We don’t find that.
- The percent that buy the product being advertised has no significant relationship to the percent that recognize a commercial for that product. As an example, for the 58 Super Bowl commercials aired during the 2004 game, differences in the number using the product being advertised accounted for less than 2 percent of the differences in recognition. None of our other measures of the number that were both reached and affected showed a closer relationship. Any effect that market share has is more than offset by differences in other factors, primarily the quality of the execution.
This shows that for most products, most of the time, the problem can be ignored and differences in lift can be taken at face value. A large amount of lift means an effective commercial, a small amount means a less effective commercial. However, there is no guarantee this applies to the advertising for every product. There could still be categories where recognition is related to use.
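A statement that one measure “accounts for” a given percent of the differences in another is usually a statement about the squared correlation between them. The sketch below shows that calculation under that assumption; the four pairs of numbers are invented and are not the Super Bowl data.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation: the share of variation in ys that a
    straight-line relationship with xs would account for."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Invented figures, one pair per commercial:
# percent using the advertised product, percent recognizing the commercial.
percent_using = [10, 20, 30, 40]
percent_recognizing = [30, 50, 20, 42]

explained = r_squared(percent_using, percent_recognizing)
```

A value near zero, as in this made-up case, would mean usage tells you almost nothing about recognition. A value near one would mean recognition largely tracks usage, which is the situation where lift could not be taken at face value.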
- People notice lots of advertising that has no effect on them. So reaching more people is no guarantee, in and of itself, that more people are going to be affected. This means that even in categories where buying is related to recognition, the actual impact of the advertising can still be seen, as long as the factors that cause that impact are stronger than the effect the other factors have on recognition.
- Our syndicated post-testing of all Super Bowl commercials provides solid evidence that differences in ad quality cause massive differences in impact. Over the years we have collected 119 cases where a commercial was only aired once on the Super Bowl and never again until we had finished our interviewing. So they all had essentially identical exposure, and it was under perfectly normal conditions - an unusual opportunity for research. Yet we found the top 20 percent had been noticed by four times as many as the bottom 20 percent. And more to the issue raised above, the number showing signs of having been both reached and affected was eight times greater among the top 20 percent.
The factors that cause those differences in overall impact are stronger than the factors that cause differences in recognition. They will stand out, even in cases where recognition has been inflated by a high market share.
- Further, insofar as being favorably predisposed to a product causes any increase in the likelihood of noticing its commercials, the amount by which it increases recognition is likely to be somewhat constant. It won’t cause people to notice commercials they have never been exposed to. A case can be made that, if you like Product A enough to buy it, the amount by which that increases recognition of Product A commercials is likely to be the same as the amount by which recognition of Product B’s commercials is increased, if you like Product B enough to buy it.
If that is true, awareness and buying interest should always tend to be higher among those who recognized commercials than among those who didn’t. The average amount of difference could be taken as the “normal” amount of overstatement that is found in post-only measures of lift. Then, greater than average differences could be considered “real” increases because they are greater than the amount that can be accounted for by the normal level of overstatement.
Although there is some logic to this position, there is also some data that suggests it is an oversimplification. For Super Bowl commercials, 16 percent of the time the lift in awareness was not a lift. It was negative. Unaided first-name awareness was lower among those who noticed the commercial. For the lift in buying interest the proportion of negative numbers was even higher: 28 percent. In both cases most of the negative numbers were small - too small to be statistically significant. But they occurred frequently enough to cast doubt on any assumption that results will always be more favorable among those who recognize the advertising. As such, it is further evidence that results from post-only studies can be relied on to be at least directionally correct. It appears any biasing effect from buyers being more likely to notice advertising is too small and too intermittent to have a major effect on post-only results.
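The “normal overstatement” idea and the check on negative lifts can both be expressed as simple arithmetic. The sketch below is one hedged way to do it: the function name, the use of a plain average as the baseline and the four lift values are all assumptions made for illustration.

```python
def summarize_lifts(lifts):
    """Return the average lift (a candidate 'normal' overstatement),
    each commercial's lift relative to that average, and the share of
    commercials whose lift was negative."""
    values = list(lifts.values())
    baseline = sum(values) / len(values)
    excess = {name: value - baseline for name, value in lifts.items()}
    negative_share = sum(value < 0 for value in values) / len(values)
    return baseline, excess, negative_share


# Invented lifts in buying interest, expressed as proportions.
lifts = {"Ad A": 0.12, "Ad B": 0.04, "Ad C": -0.02, "Ad D": 0.06}

baseline, excess, negative_share = summarize_lifts(lifts)
```

If predisposition always inflated results among recognizers, the negative share would be expected to sit near zero. The more often it does not, the less a fixed baseline adjustment can be justified.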
Evidence on cause and effect vs. correlations
Results based on pre-post tests are more conclusive because they are based on evidence that shows cause and effect. You survey the same group before and after, and hopefully the only thing that is different for every member of the group is that they have seen the advertising. If they show higher levels of awareness and buying interest than a similar group that had not seen the advertising, it is reasonable to conclude the advertising “caused” the increases.
Results based on a post-only study are like the evidence courtroom dramas refer to as circumstantial: hopefully indicative but less than conclusive. Conclusions about effectiveness rely on the fact that more favorable results are related to, or correlate with, noticing the advertising. It is not the same as the evidence in pre-post studies that shows the favorable results were caused by the advertising. One reason correlational evidence is less conclusive is that you don’t have any direct evidence showing what is the cause and what is the effect. They could be the reverse of what you might assume. (Like more favorable attitudes causing people to notice the ad.) There could also be some unknown third factor that accounts for changes in both. (Consider that old saw from Statistics 101: The number of churches in town is correlated closely with the number of bars. Do more churches cause more bars, or do more bars cause more churches? Neither. Population is the missing factor. Larger towns have more churches and more bars.)
Does that mean circumstantial evidence should be ignored? Certainly not. Often it is the only evidence that is available. Juries convict people every day based on nothing but circumstantial evidence. But it does mean the evidence needs to be evaluated more carefully - something this recap of pros and cons has hopefully contributed to.