How to beat the bots

Editor's note: Keith Rinzler is founder and CEO of 1Q. He can be reached at 1@1q.com.

The consumer insights industry should thrive in this digital age, where connecting with consumers has never been easier. Instead, we’re facing a data fraud crisis, brought on by years of neglecting the respondent experience. This crisis is undermining trust and threatening our future. Despite advances in fraud detection, bot attacks are escalating. We must shift from defensive measures to proactive, structural change to move forward.

Kantar’s report, The State of Online Research Panels, reveals that “researchers are discarding on average up to 38% of the data they collect due to quality concerns and panel fraud...with one prospect citing they had to return 70% of data provided by a leading panel provider.” And this report is not an outlier. A study by Greenbook found that up to 30% of online survey responses are fraudulent and a LinkedIn Pulse article pegged the number at 40%. Data fraud is a subset of data quality but it’s the focus of this letter because it represents the biggest single threat to our industry. Fraud alone has been estimated at 15%-30% of responses, costing the industry billions annually, financially and reputationally (Fast Company, 2022).

The secret’s now out and our industry’s data fraud issues are getting public attention. The rise in survey bots and the resulting data quality crisis is no longer under wraps and media coverage is making clients – and their leadership – fully aware of it. With budgets and jobs at stake, clients are taking this more seriously than ever. Every day we delay in solving this problem brings us closer to a major data fraud scandal that will shake the industry.

Years of mistreatment 

So, how did we get here? Price pressure from the commodification of sample responses has resulted in years of respondent mistreatment, driving away high-quality participants. Meanwhile, sample providers’ business models require constant respondent recruitment, often by any means, just to maintain sample pool size. So they resort to open-source recruiting, where shields at the top of the funnel are lowered, making it easy for bots to join panels and find surveys with minimal verification. Continuous survey opportunities from routers further incentivize bots and professional survey takers.

Since many clients haven’t been fully aware of these intricacies, they’ve trusted insights partners to “handle it,” all the while advancing the “pricing race to the bottom.” This has devalued high-quality data and perpetuated a lack of fair compensation and robust respondent validation. This repetitive cycle keeps the industry stuck in a defensive mode, with bots and other fraudsters continuing to invade panels in droves.

We are now fighting a data fraud battle against bot software writers and other fraudsters that we cannot win with defensive (reactive) methods alone. Seasoned and determined insights professionals and consumer research firms using the latest anti-fraud tools still struggle to match the sophistication of bots, some now enhanced by AI. To borrow an analogy from the 1983 film “WarGames,” we’re deep into an arms race where each side keeps enhancing its capabilities and the only way to win is not to play the game at all.

The root cause of fraud

At its most basic level, the fraud problem is not just about compensation. Sample providers and platforms use routers that direct respondents to posted lists of surveys or offer a “take survey now” button that lets fraudsters choose to take more surveys. 

This “pull” model, where sample providers and marketplaces try to pull in as many responses as possible, exacerbates the data fraud problem. Posting these lists, or otherwise offering an almost unlimited supply of surveys to take at will, is the mechanism that incentivizes bot creators and other bad actors, with outsized rewards for those who crack the code. In other words, the “availability of choice” provides the fraudsters with their primary incentive: the ability to earn more compensation by taking more surveys.

The alternative “push” model, used by companies like 1Q, aims to drastically decrease the incentive for fraud. In this model, surveys are only delivered (pushed) to individual, prequalified members based on n-size. There are no lists or “take survey now” options for bots or professional survey takers to abuse. There is nothing a respondent can do to take more surveys and they never know when they will get one. So that removes all the incentive.
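For readers who think in code, the push principle can be sketched in a few lines. This is only an illustration under stated assumptions, not 1Q's actual system: the `Member` class, the `push_survey` function and the `qualifies` check are hypothetical names invented for this example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical panel member with prequalification attributes."""
    member_id: str
    attributes: dict = field(default_factory=dict)


def push_survey(members, qualifies, n_size, rng=None):
    """Choose up to n_size prequalified members to receive one survey.

    The panel operator makes the selection. Members have no say in it.
    """
    if rng is None:
        rng = random.Random()
    # Only members who already meet the survey's criteria are considered.
    eligible = [m for m in members if qualifies(m)]
    # Never invite more people than the study needs (the n-size).
    k = min(n_size, len(eligible))
    # Random choice means no member can predict when a survey will arrive.
    return rng.sample(eligible, k)


# Example: a study that needs 3 responses from pet owners.
panel = [Member(f"m{i}", {"owns_pet": i % 2 == 0}) for i in range(10)]
invited = push_survey(panel, lambda m: m.attributes["owns_pet"], n_size=3)
```

What the sketch leaves out matters as much as what it contains: there is no function a member can call to request a survey or browse a list of them.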

Additionally, while there are many ways to commit to this principle, companies like 1Q also have 100% of surveys answered via app or SMS from a validated mobile phone. Limiting each physical, validated mobile device to a single panelist removes much of the opportunity for bot responses by design.
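The one-panelist-per-device rule can be sketched the same way. Again, this is a hypothetical illustration rather than any company's implementation: `DeviceRegistry` and its methods are invented names, and how a phone is actually validated is outside the sketch and stands in here as the `is_validated` flag.

```python
class DeviceRegistry:
    """Hypothetical registry that binds each validated phone to one panelist."""

    def __init__(self):
        self._owner_by_phone = {}

    def register(self, phone_number, panelist_id, is_validated):
        """Return True only if this phone now belongs to this panelist."""
        if not is_validated:
            # Unvalidated devices never enter the panel.
            return False
        # The first panelist to register a phone keeps it.
        owner = self._owner_by_phone.setdefault(phone_number, panelist_id)
        return owner == panelist_id

    def may_respond(self, phone_number, panelist_id):
        """Accept a response only from the phone's registered panelist."""
        owner = self._owner_by_phone.get(phone_number)
        return owner is not None and owner == panelist_id


registry = DeviceRegistry()
registry.register("+15550100", "panelist-a", is_validated=True)
# A second account on the same phone is refused.
second_account_ok = registry.register("+15550100", "panelist-b", is_validated=True)
```

Under this rule, running many accounts means obtaining many validated physical phones, which is far harder than creating web accounts.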

So the incentive and opportunity are eliminated. This is how we became 100% bot-free but ours is not the only solution. Other companies, many of whom have signed below, have made great strides in combating data fraud and should be looked at as examples as well. We commend all companies in the space who see the data fraud challenge with clarity and invest meaningful time, attention and capital in its solution.

Offense vs. defense

Almost all data fraud problems disappear if the ability of respondents (human or otherwise) to choose to take more surveys is eliminated. Offensively eliminating the fraudsters' capability (and motivation) rather than defensively trying to detect them after the fact is a far more efficient investment in the long run.

The switch from “pull” to “push” will take time and resources, including a technology investment, but the industry has no choice. This is a proven solution that anyone can implement. In the meantime, some steps can be taken immediately:

  • Treat and compensate consumers fairly. Use only permission-based panels. End the use of river sampling or other non-opted-in respondents. (If we don’t, regulators will make this decision for us.) Focus on an optimal respondent experience. Mobile-only design and the push model are great first steps. So is a “pay-per-response” approach, which discourages unnecessary survey questions with a financial penalty. Provide immediate and tangible compensation without hidden pitfalls. Fair cash compensation is essential for establishing trust. Quality data comes at a higher price.
  • Reduce open-source recruitment. Invite-only panels with rigorous screening should become standard. Start building these panels now and shift work to existing “guaranteed 100% human” panels.
  • Move from online to mobile-only (to the greatest extent possible). Mobile-phone-only registration and response is vastly more bot-resistant than web-based options and increases validation. Most data fraud is web-based, not mobile.

The true cost of low-quality data

I’d like to say that quality shouldn’t be negotiable and that we as an industry shouldn't sell low-quality data at any price but this is not realistic. There will always be companies who seek to sell lower-quality data at a lower price and there will always be clients who take them up on that offer. There’s nothing we can do about that in a free-market economy, nor should we. What we can do is educate insights customers on the true cost of low-quality data and its externalities and begin to charge what high-quality data costs. Clients must understand the real cost of quality data and cheap data should come with clear disclaimers about limitations.

If you provide high-quality data, lean into it and join initiatives that promote and uphold these standards. Join the Insights Association’s data quality benchmarks initiative. Seek ISO 20252 certification. Practice radical transparency concerning respondent-level data, results testing, your answers to ESOMAR’s 37 Questions to Help Buyers of Online Samples and your efforts to move from pull to push. It’s up to us to educate sample buyers and help them better understand modern sample practices and the ramifications of different approaches to those practices.

Yes, I’m suggesting that the industry split into two groups: those who provide high-quality data and can prove it (at a higher price) and those who focus on providing more directional survey results at the lowest-possible price. There’s nothing wrong with the latter approach, as long as buyers have transparency and know what they’re (not) paying for. Contrary to many sales pitches, I don’t think high quality can be provided at the lowest cost.

Enter synthetic data 

There are two questions in consumer research. First, are respondents who they say they are? Second, are they answering truthfully? If we can’t consistently answer yes to both questions, the prognosis for our industry as we currently know it is poor. And if we don’t solve this problem quickly, the writing is on the wall…enter synthetic data (many budgets are already being reallocated here). 

We no longer have the luxury of downplaying the simmering data fraud crisis because we are well past the tipping point. Fraud concerns have now spread beyond our industry to our customers, who are well aware, are deeply concerned and are demanding accountability. 

We have an opportunity to bring forward a solution that will restore data integrity, create efficiencies for the bottom line and allow for faster and more reliable decision-making but it requires a commitment to change the model. Building a thriving, engaged, low-fraud panel is very possible. We and others have proven it. 

A call to action

I’m passionate about this and so are the leaders of many other companies, including those who have signed below. We’re all happy to help others start the process and share what we learned as we implemented push vs. pull, online to mobile-only and other solutions. We all need to talk about this, perhaps in ways we haven’t before. So, let’s talk. Reach out to us and we’ll get back to you.

Join us and other companies in aligning with initiatives like IA’s data-quality benchmarking initiative, Global Data Quality and CASE4 Quality and collaborating toward a bot-free future. Get on board with these initiatives, now!

And to our fellow sample providers, let’s collaborate. Many of us are willing to share our own best practices on this and so are others. Together, we can expand task forces, devote conferences to this topic and significantly reduce if not eliminate the bot/fraudster problem. Solving this and bolstering our industry’s reputation benefits everyone. Let’s build a bot-free, transparent insights industry that restores trust and secures our future. Together we can turn the tide on data fraud. 

We, the undersigned, support the need for industry change and commit to being agents of change in our respective roles.

Tona Alonso, Director, Category Management and Insights – North America, Hasbro

Simon Chadwick, Managing Partner, Cambiar Consulting

Melanie Courtright, CEO, Insights Association

Bob Fawson, Co-Founder, Data Quality Co-op

Angela Glowienke, Senior Manager, Consumer Insights, Pura

Cynthia Harris, Founder, 8:28 Insights

Kerry Hecht, Founder/CEO, 10k Humans

Brad Larson, Founder/CEO, Ironwood Insights Group

Bob Lederer, President, RFL Communications

Dana May, Manager, Customer Research and Insights, Sally Beauty

Wes Michael, President and Founder, Rare Patient Voice

Lenny Murphy, Chief Advisor for Insights and Development, Greenbook | Gen2 Advisors | Veriglif | Savio 

Adam Roberts, Senior Product Manager, Survey, Numerator

Colson Steber, Co-CEO, Qlarity Access

Brett Watkins, CEO, L&E Research

Christopher Webb, VP Marketing – Head of Brand Management, The Honey Pot Company

Bump Williams, CEO, BWC Consulting