How much is too much?

Editor’s note: Beth McKeon is vice president, Design Forum Research, Dayton, Ohio.

There’s a big hole in the body of retail research data regarding one key element of in-store signage: saturation. While a significant amount of research effort has focused on what constitutes an effective sign and where to put it, little is known about signage quantity: how much is too much, too little, or just right for maximum communication effectiveness. This article details a research method, signage saturation analysis (SSA), that allows a retailer to systematically assess its current level of signage saturation and the effect on customers’ identification and interpretation of key messages. Also covered are key deliverables, such as what happens if signage saturation is altered, how to maximize the effectiveness of a signage program in terms of saturation, and how these key variables compare with those of the retailer’s key competitors.

Central to the issue of retail signage saturation is the common practice of opportunistic signing, typical of many big-box retail settings. Simply put, when the philosophy “If you want to tell the customers something, add another sign” is frequently adopted, the result can be an overwhelming array of messages for the customer to digest. In fact, such a retail environment can reach a point of diminishing returns at which there are so many messages that customers actually recall fewer and fewer of them. While each individual sign may have been based on sound design and placement principles, the sum of all of them can be counterproductive. The key questions for retailers in this situation might be: “How do I measure the possibility that I’m at that point of diminishing returns?” and “How can I get to the point of optimal customer identification and interpretation of my existing or proposed signage?” SSA is a method designed to provide quantitative answers to such questions.

SSA is based on the ability of customers to recall messages seen in wide-angle views of the actual store environment as well as views in which successive layers of signage have been peeled away. Of course, two assumptions, both with substantial face validity, are made. First, there is an inverted U-shaped relationship between number of signs and number of messages that can be recalled by a customer. Second, customer recall is one key element, certainly among many, of signage effectiveness.
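The inverted-U assumption can be made concrete with a minimal sketch (all numbers below are invented for illustration, not data from the study): given mean recall at several signage levels, fit a parabola through the best point and its neighbors and read off the vertex, the implied optimal sign count.

```python
# Hypothetical illustration of the assumed inverted-U relationship between
# sign count and recall. The data points are invented for this sketch.
sign_counts = [5, 10, 15, 20, 25, 30]      # signs visible in a view
recalled = [3.0, 5.5, 7.0, 7.2, 6.1, 4.0]  # mean messages recalled

def parabola_vertex(xs, ys):
    """Vertex x-coordinate of the parabola through three (x, y) points."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    # Newton's divided differences give the quadratic coefficient a and
    # linear coefficient b of a*x^2 + b*x + c.
    a = ((y2 - y1) / (x2 - x1) - (y1 - y0) / (x1 - x0)) / (x2 - x0)
    b = (y1 - y0) / (x1 - x0) - a * (x0 + x1)
    return -b / (2 * a)

# Interpolate the peak around the best observed interior point.
i = max(range(1, len(sign_counts) - 1), key=lambda k: recalled[k])
optimum = parabola_vertex(sign_counts[i - 1:i + 2], recalled[i - 1:i + 2])
print(round(optimum, 1))  # implied optimal sign count, between 15 and 20
```

In practice the SSA layers sample this curve at only a handful of points, so the vertex is at best a rough guide; the point of the sketch is simply that an interior optimum can be estimated once recall is measured at several saturation levels.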

At this point, we should distinguish between eye tracking and SSA as applied to the present research problem. Eye tracking has been widely used to test, with relatively small samples (N<20), where a customer’s “point-of-gaze” falls at various phases of a point-of-purchase experience, with particular emphasis on packaging, price, promotion or institutional message signs. As such, eye tracking is useful for a more “micro” examination of the effectiveness of a sign or package. It is not particularly useful, however, when larger sample sizes are needed to test the effect of too many signs, especially if many scenarios are to be tested. In other words, eye tracking yields many data points for each subject exposed to one or a few retail scenarios (Marshall et al., 2001; Young, 1999). SSA yields very few data points for each subject but facilitates the testing of many scenarios with much larger sample sizes.

SSA methodology

Before considering the use of SSA, one should verify that the signage quantity in various parts of the store would indeed be considered excessive. The best candidates are stores with multiple sets of directional, promotional, institutional and brand-related signage as well as evidence of opportunistic signage (excessive placement of handwritten or time-sensitive signs in front of other, more permanent signs). For this reason, most SSA analyses will be done in big-box retailers, not small boutique-type specialty shops. There are four basic steps to the SSA method:

  • Digital photography of retail environments. Because digital photographs will be the basis for the visual stimuli shown to respondents, the idea is to take photos in relevant store departments, each with the maximum number of legible signs. We recommend at least 10 photos for a newer generation of the client’s store, 10 of an older generation and 10 in a competitor’s new store. The older store version allows testing of the hypothesis that signage effectiveness changes over time. Digital resolution should be 2.1 megapixels or higher. One will have little problem taking such photos in a client’s store, but photos of its competitors’ stores are another story (though neither illegal nor unethical, the reception is not very warm!). Each photo should contain multiple category signs and numerous directional, promotional, institutional and brand-related signs. Figure 1 shows an appropriate example of one such photo for a fictitious retailer. These views generally come from main aisles looking at multiple end-caps and include overhead directional signage. Once photos have been obtained, they should be screened down to the best three from each store. In this case, “best” means those with the most messages and, in the case of a tie, those that are most alike between stores, as these will facilitate more valid comparisons. Next, these nine photos (three from each of three stores) must be graphically enhanced by someone with graphics software expertise (with, for example, Adobe Photoshop).
  • Graphic alterations to each photograph. This, in our view, is the most important phase. This enhancement creates three to seven versions of the original photo by peeling away layers of signage a little at a time. In essence, the graphic artist starts with the original photo and creates new layers by removing certain sets of signs at a time. A graphics professional skilled in a photo-editing software package knows best how to remove each sign, because each message removed must be replaced by what would be behind it in the real store. Graphic artists use their own artistic skill as well as the features of the software to create a new layer that looks indistinguishable from an actual photograph. In essence, each layer must look real.


Figure 1

Layering or peeling is where art and science must be applied together very carefully, as the decision about exactly what gets peeled with each layer can be rather subjective. From the “as is” photo, we often begin by removing clutter on the floor (those items that would not be there at the store’s grand opening). This is to test the hypothesis that removal of floor clutter improves overall signage effectiveness. Our recommendation for peeling each subsequent layer is to remove a set of like signs that a store manager would think he/she could most do without. This could mean removing the signs for a certain promotion plus some selected handwritten signs that appear excessive. This is continued for the next layer (i.e., photo) until the last layer is one that appears too “stark” in terms of communications, resulting in as few as three and as many as seven photos. This, then, creates a menu of photos that go from the “as is” scenario, with the suspicion of too many messages, to the stark photo, with probably too few. This maximizes the chance of finding an optimal level of signage if it, in fact, exists. Figure 2 is actually Figure 1 with clutter and some redundant signage removed.

This process is done for each of the nine photos. We recommend creating the same number of layers in each photo, as this will facilitate movie preparation and counterbalancing, which are covered in the next step. Even if this requires less peeling per layer for one photo than for another, findings will not be confounded because layers are not directly compared to each other.

  • Preparation of movies. Once the photos are prepared (i.e., layers of each are completed), they must be arranged into a “movie” - what we show respondents. The movie is a series of seven to 10 photographs shown to respondents one at a time with the purpose of testing their ability to recall messages seen. There are some key considerations for the preparation of each movie.

First, the use of a versatile movie software package (e.g., Macromedia Director or Microsoft PowerPoint) facilitates easy import of photos and modification of timing and/or sequencing. Since the movies will be shown via LCD projector to one group of approximately 10 people at a time, resolution is everything. Therefore, any graphics enhancements (including re-sizing) must appear clear and crisp when projected on a large screen.

Second, because a respondent will only see one layer of any photo in a movie, multiple movies should be developed to counterbalance photo sequence and level of peel. Each new group of 10 respondents, then, sees a different movie. Each movie should have one layer per original photo with different layers of the same original between movies. For example, in the case of nine photos and five layers of each, five movies of nine views each allow for all 45 images to be seen in counterbalanced fashion so each gets the same exposure with no bias. If 20 groups of 10 each are tested, then 40 respondents will see each movie, and, thus, each view.

Third, each movie simply consists of the nine or so views arranged in the order specified above, with each shown for seven seconds. Seven seconds is the average period of time during which, based on our video ethnographic retail studies, most customers make evaluative judgments as to how effective store signage will be - beyond which most resort to asking for help due to ineffective signage. After viewing each, respondents are asked to indicate which messages they saw by placing checkmarks on an alphabetized list of possible responses, at least 60 percent of which are bogus (messages not seen in that view). We prefer this quick-ID method to having respondents write down the messages they saw, as the latter measures more constructs than simple memory recall. Also incumbent in this phase is the preparation of accurate checklists from which respondents can select their “correct” responses, as well as an instruction sheet the moderator reads for every group (thus standardizing administration across groups).
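Because the checklist mixes real messages with bogus foils, scoring can correct for respondents who simply check everything. One common way (our sketch, with invented message names; the article does not prescribe a scoring formula) is hit rate minus false-alarm rate:

```python
# Hypothetical scoring sketch for one respondent's checklist on one view.
# Message names are invented. Checking a bogus item is a false alarm,
# which penalizes indiscriminate checking.
def score_view(shown_messages, checklist, checked):
    hits = sum(1 for item in checked if item in shown_messages)
    foils = [item for item in checklist if item not in shown_messages]
    false_alarms = sum(1 for item in checked if item not in shown_messages)
    # Corrected recall: hit rate minus false-alarm rate.
    return hits / len(shown_messages) - false_alarms / len(foils)

shown = {"50% off grills", "Garden Center", "Layaway available"}
checklist = sorted(shown | {"Free delivery", "Pharmacy", "Now hiring",
                            "Price match", "Open 24 hours"})  # 5 of 8 bogus
checked = {"50% off grills", "Garden Center", "Pharmacy"}
print(round(score_view(shown, checklist, checked), 2))  # → 0.47
```

Here the respondent correctly identified two of three shown messages but also checked one foil, so the corrected score is 2/3 − 1/5 ≈ 0.47 rather than a raw 2/3.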

  • Administering the movies to respondents. This is a fairly simple procedure compared to those previously described. Here, we recommend a focus group or mall research facility to accommodate the 50-100 respondents needed each night, in groups of 10 at a time, generally 30 minutes apart. While recruitment and compensation under such circumstances are certainly more expensive than simple store intercepts, we believe strongly that the sample should be representative of the relevant customer base and the facility must provide consistent viewing resolution, sound and lighting for each group. The moderator reads from an instruction sheet that walks respondents through the steps outlined above, showing each movie via a laptop and LCD projector.

Deliverables/key findings

Here are some hypothetical examples (i.e., the numbers are purely fictitious) of findings that could result from SSA:

  • “As currently configured, Retailer X indicates 18 percent less recall of critical elements than its competitor, Retailer Y.”
  • “Removal of floor clutter alone improves overall recall by 25 percent.”
  • “The newer version of Retailer X’s store has a signage level with 23 percent more correctly recalled elements than its older version.”
  • “At the third layer of peel, Retailer X exhibits 10 percent higher overall recall than any point of peel for its competitor, Retailer Y.”
  • “If one-third of the ABC Promotion signs are eliminated in Retailer X, recall of that promotion does not change but overall recall of other messages increases by 20 percent.”
  • “Overall reduction of signage levels by 30 percent (perhaps two layers of peel) will yield an increase in overall recall by 18 percent.”
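Findings like those above reduce to percent-change comparisons of mean recall across layers of peel or across stores. A minimal sketch of that summary arithmetic (scores invented for illustration):

```python
# Hypothetical summary step: mean corrected-recall score per layer of peel
# for one store, from "as is" (layer 0) to the starkest layer. Numbers are
# invented; real values would be aggregated from respondent checklists.
mean_recall = {0: 0.40, 1: 0.50, 2: 0.55, 3: 0.52, 4: 0.44}

def pct_change(baseline, variant):
    """Percent change of variant relative to baseline."""
    return 100.0 * (variant - baseline) / baseline

best_layer = max(mean_recall, key=mean_recall.get)
lift = pct_change(mean_recall[0], mean_recall[best_layer])
print(best_layer, round(lift, 1))  # best layer and its lift over "as is"
```

In this invented example the second layer of peel performs best, with recall 37.5 percent higher than the “as is” store - exactly the form of statement the deliverables above take.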

We certainly do not claim that SSA provides the answer to every signage question. Rather, we suggest that SSA is a tool by which a retailer can learn more about the impact of signage quantity (or saturation) in the store and what to do about it. Furthermore, the computer-aided graphics component of SSA allows new signs to be tested digitally without having to actually construct and install them. In fact, any combination of sign position, size, color, etc., can be evaluated in terms of customer recall using SSA. This means that SSA could be adapted to test scenarios in which signage quantity is suspected to be inadequate, as long as digital capabilities exist to add new signs. We even see applications of the SSA methodology to new package design and vendor point-of-purchase displays.

As with any good research methodology, signage saturation analysis provides focused insight to specific research questions but does not solve every signage problem. Rather, it is a systematic way of quantifying signage saturation, effectiveness, and clarity for existing, prototype and competitor retail environments. It provides the flexibility to test many signage scenarios with large sample sizes and can yield specific recommendations for how signs can be removed or added to optimize effectiveness. In short, it is a new component in the retail researcher’s toolkit, one with “teeth and traction.”

References

Marshall, Sandra, Tim Drapeau, and Maritza DiSciullo, “Eye-tracking helps fine-tune AT&T’s customer service site,” Quirk’s Marketing Research Review, July/August 2001. [Go to www.quirks.com and enter Article QuickLink number 703.]

Young, Scott, “A Designer’s Guide to Consumer Research,” Design Management Institute Journal, Spring 1999, Vol. 10, No. 2.