New tests for new times
Editor's note: Jeri Smith is president and CEO of Communicus Inc., a Tucson, Ariz., research firm.
Advertisers spend millions of dollars copy testing ad executions before in-market launch, often testing in rough stages to avoid producing a commercial that could turn out to be a weak performer. This research investment is widely regarded as prudent, ensuring that ad dollars aren’t wasted on advertising that may not perform. However, for the copy testing investment to be sound, it has to result in good decisions – which means that the predictive metrics it provides must be solid.
Unfortunately, the underlying principles upon which the major copy testing systems are built are no longer suited to today’s advertising environment. This mismatch between how advertising works and what copy testing measures has resulted in predictive margins of error so wide as to be unacceptable. Copy testing can identify some of the losers but it gives far too many ill-performing ads a passing grade, resulting in millions, even billions, in wasted ad dollars.
Advertisers who find that their in-market results do not reflect what they were led to expect are often left wondering why. Of course, what was tested is not always what was run, but the changes made between the copy test and the in-market launch were designed to optimize the ad, not weaken it. What, then, can explain weak in-market performance for ads that copy testing said would succeed?
Some say that there’s too much else going on in-market to be able to connect predictions with results. As an industry, we need to be better than that – that’s an excuse, and one unbecoming of the credible researchers that we are.
Media consumption habits have changed
There are a number of reasons why copy testing doesn’t work as well as we hope it will. Some reasons are related to how the advertising environment and consumers’ media consumption habits have changed since copy testing was originally conceived and developed. Other reasons are based on how the brain processes and stores advertising engagements; these realities have always been present but have become more important to get right as the clutter of brand messages has exploded in recent years.
When copy testing was first conceived, the focus was on the TV commercial. In earlier days, the typical advertiser produced a single commercial that represented the brand’s total ad campaign. Sometimes that ad would get pooled out but typically this occurred later, after the initial execution had been run for a significant period of time. As such, it was entirely appropriate that the focus of testing was on the single commercial, how consumers received it and how it might affect the brand. (Figure 1)
If you start from the premise that the TV commercial is only one of a number of elements within your brand communications arsenal – and might not even be the first or primary one to which consumers are exposed – you might develop a very different testing approach.
Copy testing was also conceived when consumers sat on the couch and watched TV as a primary leisure activity. Of course, they didn’t watch every commercial. Sometimes they took a bathroom or snack break; other times they stayed on the couch but simply stopped paying attention when what was on screen no longer interested them. But they weren’t flipping channels, they weren’t watching pre-recorded shows and fast-forwarding through the commercial pods and they weren’t engaged with a second screen while ostensibly watching TV.
If your copy testing system starts from the premise that consumers aren’t even looking at the screen, the way you test – and the commercials that pass the test – might be quite different from those in today’s forced-exposure copy testing world.
In other words, the world of advertising, and the world in which we advertise, has changed.
But now we get to what hasn’t really changed, or at least not that much. Advertisers have always known, on some level, that what consumers can play back immediately after a forced exposure to a commercial may not be entirely predictive of what they will actually get out of seeing the commercial in a more real-life setting. In fact, early approaches that involved in-program commercial exposures and 24-hour (or longer) callbacks were designed to address this phenomenon. Later, clutter reels replaced the in-program exposures and the re-contact methodology was scrapped largely for reasons of cost and timing. Now, forced exposures, with or without the clutter reel, are followed more or less immediately by questions and algorithms that we have been assured actually predict how ads will work in-market.
But common sense, not to mention actual data, suggests otherwise. In fact, there are essentially three very different brain functions that come into play as people engage with and process advertising (Figure 2). In his book Thinking, Fast and Slow, Daniel Kahneman contrasts how we experience an event in the moment with how we remember the experience afterwards. In-the-moment experiences are typically held in the brain for only about three seconds before being replaced by memories – in which the brain remembers only some aspects of what has occurred. In earlier years, researchers who intuitively understood that the actual experience of an ad is quickly replaced by the memory developed approaches to capture the experience – dials that consumers could turn to register their liking or disliking, for example. More recently, the industry has embraced neuroscience and so-called “implicit” methodologies – studying brain waves and facial expressions to provide insights into the in-the-moment experiences of ads.
But as Kahneman tells us, the experience itself isn’t what counts. What counts is what we remember. So while moment-by-moment analyses of commercials or other brand communications might be useful from a diagnostic standpoint, the insights gained cannot be relied upon to help us understand what consumers will take away from a particular ad – how it will be remembered (or not) and later acted upon (or not). Consequently, claims that the in-the-moment measures are in any way predictive of in-market success are highly suspect.
Our focus moves away
The more we move in the direction of reading how consumers are reacting as they watch commercials, whether using brain wave analysis, facial coding or other “implicit” methods, the further our focus moves away from what really counts – what consumers remember after the experience is over. Again, experiences and the memories of those experiences are different and in the case of advertising, what counts is the memory, not the experience.
This brings us to the third type of brain function: long-term memory. What a consumer remembers right after seeing an ad is held in the short-term memory. What was the ad about? Was it likeable? What brand was it for? What did it tell you about the brand? This is the main type of feedback that we receive about ads in today’s copy testing environment. But short-term memories rarely come into play when consumers are making brand decisions. What comes into play – if you’re lucky – is long-term memory.
Unfortunately for copy testers and those who rely on them, long-term memory is not simply an extension of short-term memory. Long-term memories are not even just slightly faded versions of short-term memories. Rather, they are often fundamentally different, as they are a blend of what was already in the consumer’s mind, what other associations have occurred since the initial event and how the memory is triggered later. Long-term memories, in other words, have far more to do with personal beliefs and experiences – and more to do with all of the interactions that the consumer has with the brand and the category – than they do with any single experienced or remembered commercial engagement.
Predict and diagnose
When it comes to how the brain processes and acts upon advertising, copy testing focuses primarily on the experience itself and the short-term memory of that experience. How advertising actually works in-market involves mostly long-term memories. The primary challenge for copy testing, where memory is concerned, is not to dig ever more deeply into the experience (other than to gain “diagnostic” feedback) but rather to find ways to predict and diagnose how ads and campaigns will shape long-term memories.
If you were to focus on building an understanding of how advertising builds the longer-term memories that shape brand perceptions, affinity and purchasing, the approach that you developed would likely be quite different than what we’re using today.
So the fundamental shortcomings of today’s copy testing systems revolve around: 1) what’s tested (ads vs. campaigns); 2) a basic disconnect between how we test and how consumers engage with ads (or don’t); and 3) how ad engagements are processed and stored.
Copy tests generally focus on three key performance metrics, all of which are affected by the dynamics described:
Breakthrough/attention/engagement: Because consumers don’t behave as they used to, these metrics are less predictive than they should be. Further, we know that campaigns engage consumers differently than single ads do, creating further havoc when it comes to predicting engagement, particularly for new, undeveloped campaigns.
Branding: Subject to the same issues as breakthrough but also seriously impacted by the differences between short- and long-term memory structures and composition.
Claimed persuasion on the brand: Campaigns persuade on the basis of long-term memory. Monadic ad tests, along with a reliance on immediate response, translate into persuasion metrics that, more often than not, are not replicated in-market.
That brings us to the diagnostics. If the diagnostics are designed to improve performance on predictive metrics that are not predictive, we end up improving for the copy test, not for the real world. If you religiously follow all of the diagnostic feedback that you receive, you’ll end up with an ad that does better in a re-test scenario. But if the copy test metrics don’t point in the right direction relative to what’s actually going to happen in-market, the time and dollars spent changing your ad – and making it better fit the formulas of how advertising is supposed to work – have not necessarily improved it.
Stop tinkering around the edges
As an industry, it’s time to stop tinkering around the edges of our approaches to copy testing and to recognize that the emperor has no clothes – or is at best quite skimpily clad. While there are certain aspects of today’s copy testing systems that still provide some value, it’s important to recognize what copy testing can do and what it can’t. Until we have an open discussion about the elephant in the room, we will continue to lose constituents – advertisers who experience for themselves the failures of copy testing and who, as a result, throw the baby out with the bathwater.
We must acknowledge that our current copy testing systems aren’t very good at predicting and that much of what passes as “diagnostics” is either pushing us into more formulaic and bad advertising or, when it comes to some of the newer implicit measurement approaches, simply producing in-the-moment mumbo jumbo that isn’t particularly helpful.
Advertisers spend a lot of money producing and running ads. They have been asking us to help them determine which ads to run and how to make those ads better. Not only should we be able to provide better answers, we should be able to provide insights to help them develop and refine campaigns – all elements working together in-market, over time, to connect with consumers and build brand affinity, sales and ROI.
The challenge is to:
- Find ways to better predict whether ads will disrupt and engage in today’s cluttered cross-channel environment.
- Test campaigns, not individual ads. While consumers see ads one at a time, they form lasting impressions on the basis of campaigns. And not all campaigns are TV-centric anymore – nor should they be.
- Develop research approaches that emphasize long-term memory; use experiential, in-the-moment approaches only for the types of diagnostic insights that can help us understand how to make our ads more effective.
- Incorporate behavioral measures – connecting ads with purchasing and sales data – but with caution, recognizing that advertising must not only produce sales but also needs to build brands’ top-of-funnel dynamics.
- Create better feedback loops so we can be held accountable for our predictions and for the value provided by our diagnostics. Work on providing answers to the following: Do our predictions get borne out? If not, how can we adjust our systems to make them better? And does the diagnostic analysis that we provide actually help to improve the ads and campaigns?
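To make that last point concrete, here is a minimal sketch, in Python, of the kind of prediction-versus-outcome tally a feedback loop implies. Everything in it is hypothetical and purely illustrative – the AdRecord structure, the pass/fail verdicts and the sample records are invented for the example, not drawn from any actual copy testing system – but it shows the basic accounting: of the ads we graded as likely winners, how many actually won?

```python
# Hypothetical sketch of a copy-testing feedback loop: compare the
# verdict each copy test delivered with the ad's eventual in-market
# outcome. All names, fields and records here are illustrative only.

from dataclasses import dataclass

@dataclass
class AdRecord:
    name: str
    copy_test_passed: bool    # did the copy test give the ad a passing grade?
    in_market_success: bool   # did the ad meet its in-market objectives?

def feedback_report(ads: list[AdRecord]) -> None:
    """Tally the four possible prediction outcomes and report them."""
    true_passes  = sum(a.copy_test_passed and a.in_market_success for a in ads)
    false_passes = sum(a.copy_test_passed and not a.in_market_success for a in ads)
    missed_wins  = sum(not a.copy_test_passed and a.in_market_success for a in ads)
    true_fails   = sum(not a.copy_test_passed and not a.in_market_success for a in ads)

    print(f"Passed the test, succeeded in-market: {true_passes}")
    print(f"Passed the test, failed in-market:    {false_passes}")
    print(f"Failed the test, succeeded in-market: {missed_wins}")
    print(f"Failed the test, failed in-market:    {true_fails}")

    total_passes = true_passes + false_passes
    if total_passes:
        # The article's core complaint: passing grades that aren't borne out.
        print(f"Share of passing grades not borne out: {false_passes / total_passes:.0%}")

# Illustrative (invented) records:
ads = [
    AdRecord("Ad A", copy_test_passed=True,  in_market_success=True),
    AdRecord("Ad B", copy_test_passed=True,  in_market_success=False),
    AdRecord("Ad C", copy_test_passed=False, in_market_success=False),
    AdRecord("Ad D", copy_test_passed=True,  in_market_success=False),
]
feedback_report(ads)
```

The hard part, of course, is not the arithmetic but agreeing on what counts as in-market success – and tracking it long enough to capture long-term memory effects rather than short-term bumps.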
Better, more relevant advertising
Ultimately, we should hold ourselves accountable not just for making copy testing better but also for helping our marketing partners in the creation of better, more relevant advertising. I am proud to be an advertising researcher but will be prouder of our industry when the advertising campaigns that we help to shape are stronger – more interesting, more engaging and more effective in building the brands that they support.