Editor’s note: Annie Pettit is a research methods consultant based in Toronto. 

Marketing researchers fret meticulously over data quality. We hate cheaters, fraudsters and liars. We hate when people don’t pay attention when they complete our carefully thought-out questionnaires. 

Know what else I hate? When clients find typos in my work. When I misunderstand someone and laugh at what I thought was a joke but was actually horrific news. When I buy a third box of baking soda because I forgot I bought one last week and another one last month. (Could I not make that mistake with brownies?)

Researchers excel at faulting participants for the same behaviors we accept from ourselves. We’re supposed to be experts in human behavior, yet we are prime examples of actor-observer bias (e.g., participants are cheaters but I just made a little mistake) and hostile attribution bias (e.g., participants are evil, not bored). 

To become impartial scientific observers, we’ve learned to cast aside our human shell. We disregard human experiences wherein people get tired, bored, distracted and hangry. That people misread, forget and make mistakes. That people have plenty to do other than carefully answer a multitude of overly precise and boring questions on their cell phone for six measly dollars an hour.

A problem from day one

Twenty years ago, when the Internet was an emerging technology and paper questionnaires were the norm, I spent a few years researching the data quality of online surveys. Know what I found? People have been giving socially desirable and random responses to online and paper questionnaires from day one.

Since then, I’ve tested and retested data quality among at least 10 different access panels and the same thing always happens. A minority of people, maybe 20 to 30 percent, complete questionnaires without any mistakes. The natural disposition of data provided by human beings is poor data quality. In other words, normal data quality.

The only conclusion is that the overwhelming majority of research participants aren’t cheaters, liars or fraudsters, and it’s disrespectful and arrogant to think otherwise.

Of course there will always be a tiny minority, usually less than 5 percent, who participate with malicious intent. Experienced researchers don’t fear these meddlers, as there are many techniques to identify cheaters and liars throughout the questionnaire completion and analysis process. From red herrings to statistical outlier analysis, a wide range of data quality techniques can be concealed in well-designed questionnaires. When implemented well, no one but the researcher who designed them can identify them. No participant can deliberately pass them. Well-designed and well-analyzed questionnaires eliminate the vast majority of records from mal-intentioned participants.

Thanks to technology, we get to worry about new types of data quality. People can sign up for 100 Gmail accounts and register each one on 100 different access panels. Eight years ago when I worked at Ipsos, we had technology in place that easily identified that type of behavior and we took great joy in quashing these accounts before they were ever sent one questionnaire. 

Today, researchers are dealing with cell phone farms where hundreds of individual phones are programmed to complete a task. Share this fake news! Like this Facebook page! Take a survey now! We’re also dealing with bots that can answer thousands of questionnaires in mere minutes or seconds. 

Know what? Panel companies have techniques in place to stop the vast majority of these fraudulent completes. As Steve Mossop, president at Angus Reid Forum says, “Panel companies like ours have technological security measures in place to eliminate bots – but more importantly respondent engagement/logic measures in place along with data analysis techniques that ensure respondents are being truthful and accurate in their opinions.”

Focusing on what is in your control

What worries me the most about research data quality? 

Not technology. Access panels devote teams of people and the most up-to-date tools to vastly reduce the impact of bots and harmful technology. Most research data sets from top-quality panels have very little fraudulent data.

Not cheaters. Skilled researchers design questionnaires with appropriate sample sizes (e.g., not cost-cutting nor time-saving sample sizes) and many different data quality techniques to eliminate this data. Most research data sets from top-quality researchers have very few cheaters.

Not imperfect human beings. Human behavior isn’t poor quality. It’s human behavior. Every data set is full of people being people.

I worry about the things within my control. I worry that research providers and buyers aren’t writing questionnaires that work for people. And we can fix this.

We can respect and account for human bias. That means not being afraid to use “Don’t recall” and “Don’t know” as answer options. That means designing questions to account for the fact that human beings are predisposed to fudge answers to our unclear, incomplete and uncomfortable questions. 

We can use plain, engaging language. It’s OK to say “No worries” and “Thanks a bunch.” Research participants love it when I offer a bit of humor by saying, “May the survey force be with you.”

We can share research results with participants even when it seems like everything is proprietary. I assure you, it’s not. 

We can insist on and embrace using at least five data quality techniques in every single questionnaire. And be OK when every human being triggers at least one of those techniques. Because people aren’t robots.

We can stop calling research participants cheaters, liars and fraudsters.

Are you with me?