Editor’s note: Gary M. Mullet is president of Gary Mullet Associates, a Lawrenceville, Ga., data analysis and consulting firm.
As this is being written, best/worst lists are popping up - sometimes literally - in various media. Best movies, worst dressed, most expensive cities in which to live, and on and on. One advantage of the onset of my dotage (very rapid onset according to my children) is that it’s becoming socially acceptable for me to be an old curmudgeon. With that in mind and tongue only sometimes in cheek, I’ve compiled a list of, for want of a better term, research atrocities (all disguised). The faithful readers of this column will recognize an occasional reprise of a couple of my earlier favorites, but most will be new and, I hope, informative. And there may be some, mostly inadvertent, overlap between categories. And oh yeah, as Dave Barry is wont to write, “I’m not making this up.”
Respondent abuse
Well, the wording may be a little strong, but there are several instances of making the respondent’s task extremely difficult. Sure, we want every research project to answer every possible question posed by everyone involved. And in some big organizations there are lots and lots of folks, from lots of departments, involved. Without further ado, here are a couple of my favorites:
- In a choice exercise, respondents were to pick between six or seven products, with the ever-present option of none, at each step. The products included descriptions on a half-dozen or so attributes, each of which varied over several levels (such things as color, size, etc.). What made this interview so tough for respondents was that they were expected to do this choosing 81 times each. Is it any wonder that the results didn’t make a lot of sense?
- A conjoint exercise was conducted where respondents were asked to rate each scenario independently of others, i.e., this was not a choice exercise. What made this one so difficult was that each scenario included one level of each of 33 different attributes. Again, there were more than a few anomalies in the results.
- In order to save a buck or two, a client designed and fielded a conjoint study. Only when it came time for the processing of the data was it discovered that two of the attributes were perfectly correlated. For example, if one variable was color, say puce, lilac and mauve, and another was size, say small, smaller and smallest, the design was such that small was always puce, smaller was always lilac and smallest was always mauve. The upshot of such a design is that one can separate neither the utilities nor their importances for these particular two attributes - the sketch just below shows why. (Cue the sound of a toilet flushing research dollars.)
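For the numerically inclined, here is a minimal sketch of what that confounding does behind the scenes. It is a hypothetical example - the profiles, attribute levels and dummy coding are all made up, and Python with numpy is simply my assumed tool, not anything the offending client used. The point is that the coded design matrix loses rank, so no estimation method, conjoint or otherwise, can tell the two attributes apart.

```python
# Made-up illustration of the confounded design described above: two
# three-level attributes (color, size) whose levels always co-occur, so
# their coded columns are identical and the utilities are inseparable.
import numpy as np

# Each row is a profile; color and size move in lockstep by construction.
profiles = [
    ("puce", "small"), ("lilac", "smaller"), ("mauve", "smallest"),
    ("puce", "small"), ("lilac", "smaller"), ("mauve", "smallest"),
]

def dummy_code(values, levels):
    """Simple 0/1 dummy coding, dropping the last level as the reference."""
    return np.array([[1.0 if v == lev else 0.0 for lev in levels[:-1]]
                     for v in values])

colors = dummy_code([p[0] for p in profiles], ["puce", "lilac", "mauve"])
sizes = dummy_code([p[1] for p in profiles], ["small", "smaller", "smallest"])

X = np.hstack([np.ones((len(profiles), 1)), colors, sizes])  # intercept + 4 dummies
print("columns:", X.shape[1], "rank:", np.linalg.matrix_rank(X))
# -> columns: 5 rank: 3  (the color and size columns are identical, so any
#    split of the combined effect between the two attributes fits equally well)
```

Any split of the combined effect between color and size fits the data equally well, which is a polite way of saying the study cannot answer the question it was fielded to answer.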
One of the points here is that there is a growing number of sources that will show you how to set up an appropriate experimental design for projects such as these. But as researchers we need to ask ourselves whether we should put such unwieldy projects into the field without a lot of prior thought. Another point is that inexpensive experimental designs are not always good experimental designs. Penny-wise and pound-foolish would certainly apply here.
Regression analysis abuse
Ordinary least squares regression analysis is a powerful, widely used tool. Everyone learned all about it in statistics 101, right? Well, maybe not quite all. It seems that regression is widely abused, due to some fundamental misunderstandings. So the following list includes both some generalities and specifics.
- Here’s something that causes lots of head scratching. When using a forward selection or stepwise procedure for selecting independent variables to go into the final model, once in a while you’ll find that nothing is related strongly enough to the dependent variable, say overall rating of the product, to enter the model. Thus, you’re left with nothing to report. However, what could be happening is that had you forced the variables into the model, the overall relationship might have been statistically significant, even though none of the attributes alone were significant. Doesn’t happen often, but sure does happen. (A sketch following this list shows one way it comes about.)
- Short of going through a class in projective geometry, you’ll have to accept this one on faith: negative signs on regression coefficients are not “wrong signs,” nor are they necessarily intuitive. An attribute may be, all by itself, positively related to the criterion (overall rating, remember), i.e., have a plus sign for its regression coefficient. Yet when introduced into a model with other variables, the sign may go negative but still be significant, or may become statistically nonsignificant - in which case the sign is immaterial. The same is true of an attribute that was originally negative. Even more baffling to some is that an attribute that is not significantly related to overall opinion by itself may show up with either a significant positive coefficient or a significant negative coefficient. As the King said, “It is a puzzlement!” There have been some interesting gyrations as suppliers tried to explain these occurrences to their ultimate clients.
- Speaking (well, writing actually) of gyrations, we’ve all seen lots of manipulations performed to allocate percentage of variance attributable to each independent variable in a multiple regression model. It’s tough to do because the so-called independent variables aren’t truly statistically independent. Thus, there’s no really good clean way to do this allocation; shared variance caused by multi-collinearity is the problem and, yeah, it makes life tough for some of us in cases like this.
- Here’s one from the archives, which is included here since correlation is so intimately related to regression. A series of yes-no answers were coded twice; the first was Yes=1 and No=0 and the second was just the opposite. Then correlation coefficients were generated between the two sets of recodes - each item paired with its own recode gives a correlation of exactly -1 by definition, so the computer was asked to confirm what the coding sheet already guaranteed. Duh!
- In another instance, the task was to estimate the relationship between price charged and quantity demanded, a la basic Economics 102. So data were collected asking respondents how many they would purchase at various prices. Then, for some unknown reason, quantity was selected as the independent variable and price the dependent in the regression equation. Out pops the resulting simple regression equation, which was immediately recognized as the wrong one. But, instead of rerunning the analysis correctly with quantity as the dependent variable, the analyst fell back on basic algebra and solved the fitted equation for quantity as a function of price. Intuitively O.K., but really a major blunder. (A sketch following this list shows why the algebra doesn’t rescue you.)
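Here is a small simulation of the stepwise head scratcher from the first bullet above - a hypothetical example with invented data, using Python with numpy and statsmodels as my assumed tools, not anything from an actual study. Two nearly collinear attributes each look useless on their own, so a forward selection routine would leave them both out, yet forcing them in together produces a model that is clearly significant.

```python
# Invented data: the rating depends only on the small gap between two
# nearly collinear attributes, so neither attribute looks useful alone.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
x2 = x1 + 0.3 * rng.normal(size=n)             # nearly collinear with x1
rating = (x1 - x2) + 0.3 * rng.normal(size=n)  # depends only on the gap

for label, x in [("x1 alone", x1), ("x2 alone", x2)]:
    fit = sm.OLS(rating, sm.add_constant(x)).fit()
    print(label, "slope p-value:", round(float(fit.pvalues[1]), 3))

both = sm.OLS(rating, sm.add_constant(np.column_stack([x1, x2]))).fit()
print("both forced in: R-squared =", round(float(both.rsquared), 2),
      "overall F p-value =", round(float(both.f_pvalue), 6))
# In most draws neither single-variable slope clears the usual significance
# hurdle, while the forced two-variable model's overall F-test is clearly
# significant; the exact numbers depend on the random seed.
```

It is also a compact illustration of the “not significant alone, significant together” half of the sign-and-significance puzzlement in the second bullet.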
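And the pricing blunder, again as a hypothetical sketch with invented numbers rather than the study’s data: regressing price on quantity and then solving the algebra for quantity is not the same thing as regressing quantity on price, and the gap grows as the data get noisier.

```python
# The reversed-regression trap, with invented demand data.
import numpy as np

rng = np.random.default_rng(2)
price = rng.uniform(1.0, 5.0, size=200)
quantity = 100 - 15 * price + rng.normal(scale=10, size=200)  # noisy demand

# Correct: regress quantity on price.
b_direct, a_direct = np.polyfit(price, quantity, 1)

# The blunder: regress price on quantity, then invert the equation by algebra.
b_wrong, a_wrong = np.polyfit(quantity, price, 1)
b_inverted = 1.0 / b_wrong        # implied change in quantity per dollar
a_inverted = -a_wrong / b_wrong

print("direct slope:  ", round(b_direct, 2))
print("inverted slope:", round(b_inverted, 2))
# The two slopes agree only when the correlation is perfect; with noisy data
# the inverted line is always steeper than the direct one.
```

The inverted slope is the direct slope divided by r-squared, so unless the fit is perfect the “algebra” line always overstates how sharply demand responds to price.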
As noted, regression analysis is very widely used in marketing research studies. It’s our friend and, of course, everyone understands all of the subtleties and nuances, right? Well, probably not. So tread very carefully with even these innocuous-seeming analyses.
Missing data/item non-response abuse
Much as we’d like to believe otherwise, our customers really can’t evaluate all of the dimensions of all products or services that you may want to query them about. So it’s sometimes OK for someone to respond with a, “Gee, I really don’t know what score I’d give to _________ because I don’t use that part of the product.” A personal example, if you please. I always use the drive-through window of a particular branch bank. If I’m participating in a survey about that branch and you ask me anything about inside service, I’d have to say, “Beats me.” I can respond to branch convenience, hours, drive-through efficiency and a myriad other things but not to everything. So, in a sense, no answer is an answer. Let’s look at some interesting ways of handling missing data.
- As sometimes happens, there will be a respondent or two or 10 who won’t answer anything at all. In one particular study, the task was to factor analyze the answers to a series of 50 questions. The project director decided to substitute the means of all respondents who did answer a given question for corresponding missing data - a common but not necessarily the best solution to item non-response. The two respondents who answered none of the 50 items were also included because the project director wanted her client to get results for the entire sample, not for two fewer than the entire sample.
- Here’s one I love! Data coders put favorable response codes for missing answers in a series of open-ended “What did you think about...” questions.
- This is difficult to explain, so please be patient. A questionnaire was to include 120 scale items for a factor analysis. The research director recognized that this might be a bit of a stretch for respondents, and so he broke the task into four chunks using four independent cells. Cell #1 was to rate statements one through 30, cell #2 handled 31 through 60 and so on. No overlap in the questions from cell to cell. So essentially, each question had 75 percent missing answers as a minimum. How to do the factor analysis? Easy. Just use the mean substitution option available on most computer programs. Thus, for each statement, at least 75 percent of the answers were a constant value. This does verrrry interesting things to the resulting correlation matrix whence factors are extracted. (A sketch of just how interesting follows this list.)
- In this example, the missing data were intentional. Respondents were each assigned to one of three independent cells. Cell #1 tested product A versus product B, gave a series of diagnostics and an overall preference measure. Cell #2 did the same with product C versus product D and, you guessed it, cell #3 compared product E with product F. Of course, through data analysis magic, the final report showed all six products ranked on all of the diagnostic scales as well as overall preference. This worries me a great deal!
- Maybe there should be missing data when there apparently are none. In one study, 200+ respondents were to rank 32 concepts by preference. Not a single respondent had a single missing data field. Sure raises a red flag, but it might have been legitimate.
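To see why mean substitution in that four-cell design is so troublesome, here is a sketch with two invented items standing in for items from two different cells. The data, the cell structure and the “truth” variable are all made up for illustration, and Python with numpy is an assumption on my part, not the package the research director used.

```python
# After mean substitution in the four-cell design, items asked in different
# cells are forced to correlate at (essentially) zero, no matter how strongly
# related they really are.
import numpy as np

rng = np.random.default_rng(3)
n = 400                                  # 100 respondents per cell
cell = np.repeat([1, 2, 3, 4], 100)

# Suppose item A (asked only in cell 1) and item B (asked only in cell 2)
# really measure the same underlying opinion for everyone.
truth = rng.normal(size=n)
item_a = np.where(cell == 1, truth + 0.3 * rng.normal(size=n), np.nan)
item_b = np.where(cell == 2, truth + 0.3 * rng.normal(size=n), np.nan)

def mean_substitute(x):
    """Replace missing values with the mean of the observed values."""
    return np.where(np.isnan(x), np.nanmean(x), x)

a_filled, b_filled = mean_substitute(item_a), mean_substitute(item_b)
print("correlation after mean substitution:",
      round(np.corrcoef(a_filled, b_filled)[0, 1], 3))
# -> essentially zero: no respondent has a nonzero deviation on both items,
#    so every cross-product in the covariance is (near) zero by construction.
```

The whole cross-cell block of the correlation matrix gets zeroed out this way, so the factor analysis is doomed to find factors that respect the questionnaire’s arbitrary cell boundaries rather than anything about the respondents.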
As y’all know, there are a variety of ways to handle missing data/item non-response. The ones above are some of the more inventive that I’ve ever seen. The last one leads into the next section.
Online interview abuse
What, you thought Bill Gates’ minions were gonna get off unscathed? No way. Online interviewing may still be a relative infant in marketing research data collection, but the following did occur:
- A study was conducted in several different countries using identical data collection instruments. The final sample sizes were roughly equal, several hundred respondents each. Holding a couple of countries aside for the moment, the number of questions answered by individual respondents ranged from a low of 24 percent to a high of 100 percent. Most, but not all, questions had at least a few no-answers as well, with the highest showing 50 percent non-response in one particular country. So far, then, the data were typical and about what was expected (I can’t evaluate an ATM if I don’t use ATMs). In two countries, respondents were “forced” (I don’t think that bamboo splinters were introduced under the fingernails, but something was done) to answer all questions. I find this feature of online interviewing onerous, but in many other studies respondents were and are forced to answer everything. Data comparability between these countries is especially bothersome to me for studies like this. (As an aside, we did learn a lot about how native culture influences scale usage.)
- There was an extremely long list of concepts for respondents to evaluate on a 7-point scale. Typical respondents found it easier to give constant answers down the entire list rather than thoughtfully discriminate among them. The project director decided to eliminate from the analysis anyone who gave constant answers to the entire slate. Of course, that dropped anyone who really felt that each concept was worthy of a 5, say. (A sketch of this sort of straight-liner screen follows this list.)
- A food-item study was administered to female heads of household since they were seen as the primary grocery shoppers. A telephone follow-up was done to verify that the respondent was actually the FHH. Somewhere around 50 percent of purported respondents claimed that they didn’t even know about the study. Seems that another member of the household, generally a teenage child, was using the computer with mom’s Internet account.
- Analyzing data from one particular online survey showed that income was inversely related to age - the 18-24-year-olds were more likely to be in the $100,000+ category than in any other. Whoa, what happened? The survey designers used radio buttons for respondents to indicate their response to a given item. But they thoughtfully pre-filled the first button, so to enter anything other than the first response a respondent had to click on another one. The 18-24 button was first for age, so if you didn’t want to reveal your age you were assigned the pre-filled 18-24 answer. Same deal with income, only, for who knows what reason, the highest income of $100,000+ was listed first. So again, if you didn’t tell your income you got stuck in the highest bracket. As long as the IRS doesn’t find out you’re O.K., but the survey results were sure out of whack!
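On the constant-answer problem a couple of bullets back, flagging straight-liners is the easy part; the trap the anecdote describes is in the next step, treating every flagged row as junk. Here is a rough sketch, with an invented pandas data set of 1-to-7 ratings (one row per respondent, one column per concept); the column names and the planted flat respondent are purely illustrative.

```python
# Flag respondents who gave the same answer to every concept in the grid.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
ratings = pd.DataFrame(rng.integers(1, 8, size=(200, 40)),
                       columns=[f"concept_{i+1}" for i in range(40)])
ratings.iloc[0] = 5          # plant one respondent who gave 5s straight down

flat = ratings.nunique(axis=1) == 1   # same answer to every single concept
print("flagged as straight-liners:", int(flat.sum()))
# Flagged rows deserve a look (interview length, open ends, answers elsewhere
# in the survey), not an automatic delete; some respondents may simply mean it.
```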
Be skeptical
Be skeptical and don’t hesitate to ask lots of questions of everyone involved in the project. Beware of black box approaches wherein you don’t get to peek inside the box. Be intuitive - we generally have a fair-to-middlin’ idea of what the study will yield. Sure, we’ll see minor surprises, but major, earth-shattering shocks are still pretty rare in our business. My guess is that most of you have seen many similar abuses and are working hard to remedy the situations that cause them. Certainly, really digging into the data helps. Don’t think that the canned statistical analysis computer packages can do your thinking for you. But don’t go to the other extreme - one project director used the independent samples t-test rather than the correct dependent samples t-test because he felt that SPSS was wrong in its computations for the latter (see the sketch below).
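On that last point, here is a quick illustration, with invented before-and-after ratings from the same respondents and scipy standing in for SPSS (an assumption of mine, purely for the sketch), of why the two t-tests are not interchangeable.

```python
# Paired vs. independent t-tests on the same (invented) before/after data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
baseline = rng.normal(loc=6.0, scale=1.5, size=30)           # each respondent's own level
after = baseline + rng.normal(loc=0.4, scale=0.5, size=30)   # small, consistent lift

print("paired t-test p:     ", round(float(stats.ttest_rel(after, baseline).pvalue), 4))
print("independent t-test p:", round(float(stats.ttest_ind(after, baseline).pvalue), 4))
# The paired test works with each respondent's own change, so it usually
# detects the lift easily; the independent test dumps all the respondent-to-
# respondent spread back into the error term and often misses it.
```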
Oh yeah, just one more thing (with apologies to Lt. Columbo)
Like most, if not all, of you, I couldn’t function without my computers. I use them day in and day out. (In fact, my wife accused me of loving my computers more than I do her. While not disagreeing, I said, “Yeah, but I love you more than I do the printer.”) Anyway, the term is “discriminant analysis,” not “discriminate analysis.” There are a lot of people who have spent a lot of money on a lot of research projects/reports/presentations who routinely accept the latter term, even though it’s not the correct one. Why is that, you sagely ask? My guess is that computer spell checkers don’t recognize “discriminant” and suggest “discriminate” as an alternative, exactly as the one I’m using does. The researcher who accepts the proposed substitution can be left with egg on his face if he’s making a presentation to an enlightened audience. Be careful out there; just because you’re paranoid doesn’t mean that they’re not out to get you.