Evaluate Your Research Partners on 7 Key Data Quality Criteria
Editor’s note: This article is an automated speech-to-text transcription, edited lightly for clarity.
On November 20, 2024, quantilope sponsored a session on data quality criteria to look for when finding a research partner. The organization was one of six sponsors of the Quirk’s Virtual Sessions – Data Quality series.
quantilope’s presenters Lindsey Guzman, Ph.D., solutions consultant, and Andrea Podel, associate director, global panel consulting, outlined seven aspects of data quality to check for when evaluating research partners, including data cleaning technology, survey design and more.
Session transcript:
Joe Rydholm
Hi everybody and welcome to our session “Evaluate Your Research Partners on 7 Key Data Quality Criteria.”
I’m Quirk’s Editor, Joe Rydholm and before we get started let’s quickly go over the ways you can participate in today’s discussion. You can use the chat tab to interact with other attendees during the session and you can use the Q&A tab to submit questions for the presenters during the session and we will answer as many questions as we have time for during the Q&A portion.
Our session today is presented by quantilope. Andrea, take it away!
Andrea Podel
Thanks Joe and thanks everyone here for joining quantilope’s webinar about how to evaluate your research partners on seven key data quality criteria.
Before we go into the presentation, I just wanted to share some information about quantilope in case any of you aren't familiar.
quantilope's end-to-end consumer intelligence platform arms insight professionals with the most advanced research technology. Through automated tracking technology and 15 fully automated advanced research methods, we empower brands to feel confident in their decision making based on real-time insights that provide clear recommendations.
Also, just a few weeks ago, quantilope was named the number one top technology provider in the GreenBook GRIT report. That recognition is a testament to our commitment not only to innovation, but also to delivering high quality data that drives accurate and reliable insights.
High quality data is the foundation of reliable insights, but ensuring its accuracy and objectivity is an ongoing challenge for researchers. This session will reveal seven key questions to ask your research partners about their data quality processes, ensuring your insights are built on high quality responses from real consumers.
Now we're going to just do a quick introduction. I am Andrea Podel. Nice to meet all of you. I'm associate director of global panel consulting at quantilope. I have over 13 years of experience focused on global panel and field management, spanning B2B, consumer and health care market research.
I've been at quantilope for about one and a half years where I focus on panel partnerships and fielding efficiencies, as well as lead a team of panel consultants who leverage best in class sampling practices to ensure clients receive the highest quality data with maximum efficiency.
Now, I'm going to hand it over to Lindsey to do a quick introduction as well.
Lindsey Guzman
Thanks Andrea.
Hi everyone, my name is Lindsey Guzman. I'm a solutions consultant here at quantilope.
I've been with quantilope for about three years now, but I have 15 years of research experience. I have a Ph.D. in sociology with expertise in survey methodology and design.
I'm so very excited to talk to you all today about data quality.
Andrea Podel
Thanks Lindsey.
All right, let's move along.
First, I just want to introduce a poll to all of you to keep up our engagement with the presentation.
Have you ever had a delay in your research due to bad data? You should see shortly a pop-up on your screen to answer that question, so please move forward with answering that and then if you'd like, I'd love for you guys to share some experiences in the chat, and we can see what's going on there. Thank you.
We'll just give it a couple more seconds on the poll.
All right, so with an early review of the poll, 93% have said ‘yes’ to this question, which does not surprise me. That is part of the reason we are having this presentation.
Let's see if we have any feedback in the chat. All right, well I'm going to continue and see if I happen to see any comments in the chat.
As expected, that is pretty high at 93%. According to a 2023 Insights Association member survey on data fraud, 64% have had a project delayed or negatively impacted by fraud. Additionally, 56% say their decision making has been impacted by fraud.
One audience member mentioned seeing fake surveys, gamed either for farming incentives or to purposely bias results. Someone mentioned they tossed well over 50%, or even higher, when a bot hit their project. I've been there too, Holly.
I hear time and time again about teams removing 20% or more of their data due to bots and bad open ends, and then having to go back into field, which can add days or up to a week of fielding when you're probably already behind on analysis and reporting.
Today we're going to dive into seven key data quality criteria that will help prevent these negative impacts for you.
First, let's talk about the criteria to evaluate your research partners. Here are some high-level topics we'll be discussing today.
- Data quality basics – What are the basic data quality standards you should look for in your research partner?
- Panel sources – Where does your sample come from? How do you evaluate sample quality?
- Data cleaning – What are your data cleaning practices?
- Survey design – How can you ensure your data is valid and reliable? And how do you ensure a survey is optimized for your respondents?
The first question to ask your market research partner is ‘what are your basic data quality standards? Do you have an internal panel team?’
A dedicated panel team provides experts who can help find the best sampling options for your target audiences and ensure high quality data along the way. Every company that has an internal panel team may work a bit differently.
You may want someone who will manage your fielding from end to end, allowing you to focus on the reporting and analysis; there are partners who offer this. Or, if you want to manage the fielding on your own, there are partners who give you direct access to sampling platforms, with panel experts stepping in only when you face fielding challenges or have specific questions.
Again, it really depends on what your need is.
What are your fraud prevention solutions?
Fraudulent respondents are becoming more and more intelligent with AI. It is important that companies stay ahead by implementing technology to prevent this fraud.
This can include digital fingerprinting, which is information collected about a device for the purpose of identifying a specific individual.
And secure end links, which prevent fraudsters from manipulating an end link so it registers as a complete without the respondent actually completing the survey. Been there a few times, unfortunately.
Additionally, link hashing, which prevents unauthorized access to or manipulation of the survey URL, allowing only legitimate respondents to participate in the survey.
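To make the secure end link and link hashing ideas concrete, here is a minimal sketch of signing a survey redirect URL with an HMAC, assuming a secret shared between the survey platform and the panel provider. The URL, key and function names are illustrative assumptions, not quantilope's actual implementation.

```python
import hashlib
import hmac

SECRET_KEY = b"shared-secret-example"  # hypothetical key known only to both systems

def sign_end_link(respondent_id: str, status: str) -> str:
    """Build a redirect URL whose query string is protected by an HMAC signature."""
    payload = f"rid={respondent_id}&status={status}"
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"https://panel.example.com/end?{payload}&sig={sig}"

def verify_end_link(payload: str, sig: str) -> bool:
    """Reject a link whose signature doesn't match, e.g. a 'screened out' end link
    that a fraudster rewrote to claim a 'complete'."""
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

# Changing status from 'screenout' to 'complete' invalidates the signature.
link = sign_end_link("r-123", "screenout")
print(verify_end_link("rid=r-123&status=complete", link.split("sig=")[1]))  # False
```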
Lastly, are you ISO certified?
The ISO certification provides added assurance that all company data is under the best possible security protocols. You'll want to work with a partner that is ISO certified; specifically, the ISO 27001 certification ensures that an organization's information security management is aligned with best practices.
So far, we've reviewed basic standards around data quality. Now let's dive into how to evaluate specific panel capabilities.
So, where does your sample come from?
When evaluating a market research partner, it is important to share your target audience from the start. This ensures the partner can effectively identify and recruit your desired audience, whether it's a niche demographic, specialized professionals or geographically dispersed population.
This upfront communication fosters a strong partnership and increases the likelihood of achieving your research objectives with high quality data. There are a few options from where companies can source their panel.
First, a proprietary panel. What is this? It is an online group of individuals who have agreed to participate in surveys exclusively for a specific market research company or platform.
Next is an external panel. This is a group of individuals who have agreed to participate in surveys, sourced from a third-party provider offering its services to market research companies.
This can also include panel marketplaces, which are online platforms that connect researchers with diverse networks of panel providers and respondents. They act as intermediaries allowing researchers to access a broader range of survey participants and choose the best panel source for their specific needs.
Now both types of panels have their own benefits and challenges and we're going to go into a bit more detail of that.
For example, when using proprietary panels, the benefits are highly engaged panelists, quick setup, fast turnaround times and cost effectiveness, since that cost is internal to the market research company.
However, the possible challenges are:
- Demographic limitations – Are we able to get the census representation that we need with this panel?
- Panel conditioning – Are these participants completing surveys frequently, therefore expecting to know the answers and being conditioned to participate?
- Limited flexibility, which also is related to targeting niche audiences – Are we able to target very specific low IR projects with these proprietary panels?
On the other hand, a partner using external panel sources gives you access to any panel, including niche audiences through specialty panels. There are a lot of panels out there that can target specific niche audiences (dare I say fishermen, aerodynamics, anything like that), so having the ability to reach out to any panel is really beneficial. The same goes for global reach: reaching markets beyond Europe and Asia, such as the EMEA and Middle East markets. And excluding past participants means you can reach a much larger sample size.
Then of course, with an external panel, you can use multiple partners to recruit very challenging studies. This can also reduce bias or panel conditioning, which, as we mentioned, can be a challenge on the proprietary side.
However, there can also be challenges, such as potential delays: you and your market research partner are ready to go, but they still need to connect with a panel partner to get the sample going, which could add a couple of hours. Higher costs are also involved since, again, it is not an internal proprietary panel.
There is also the need for due diligence. The market research provider should vet the panels to make sure they have strong data security and ensure all NDAs are in place, since data is potentially being shared back and forth.
Additionally, a little more on marketplaces: they can be really great for high-IR gen pop sample. Many market research partners use them, and they can deliver a very quick and very cost-effective recruit. As long as the clean-out rates remain low, which we'll talk about a bit more, I really don't see any issues with this. Some even have their own technology for ensuring data quality within surveys.
The challenge with these marketplace platforms comes with recruiting those niche audiences, which we've already talked about a bit: making sure that you can find the proper audiences.
All right, so whether your partner has a proprietary panel or uses external sample, it's still important to consider the panel composition and ensure it fits your needs. Here are some considerations: understanding the panel size, global reach and demographic composition.
Does this fit your research needs? Are you able to reach the audience you need? How often are panelists participating in research? How many projects are you running a year?
Say you're managing maybe 10 or 15 projects a year. Are you able to tap into fresh respondents for each survey or do you need to reuse them after a certain amount of time? This can be even more challenging with smaller proprietary panels.
Additionally, you can always add past participation exclusions on specific products or topics to avoid any biased responses.
What are their targeting capabilities? While many have large panels of gen pop consumers, it is helpful to have targeting abilities for specific product usage or ownership, which will help with quicker recruiting and higher incidence rates.
All panels have varying targeting, so it's helpful to have a few panels available to determine the best fit for your specific research.
A lot of times at quantilope we will have a niche audience and reach out to our standard go-to partner, but if they're not able to target that and another provider is, we will likely go with that other provider. It really does make a difference in feasibility and cost, as well as incidence rate.
And then of course, niche audiences such as business professionals or even new moms can be quite challenging to recruit. Does this panel fit those needs?
And lastly, trackers. I know we all know these well. They typically require large sample sizes and exclusions throughout the year. We here at quantilope typically recommend excluding respondents for up to three to six months depending on the audience. Is this panel feasible for this level of sample?
All right, so now that you know a bit more about proprietary panels or external panels, I did want to open up another poll to share your thoughts on ‘do you prefer to work with market research partners who have proprietary panels or use external panels?’
And again, if you have a chance, please share your reasons why in the chat. And I'm going to jump over here to take a look.
All right, so it is a bit of a mixed response, but right now we're seeing 65% of you prefer proprietary panels while 35% of you prefer external panels. I'm going to jump over to the chat and see what you guys have to say.
“A mix of both usually to overcome challenges posed by both panel types.”
That's a fair answer there. Absolutely. They definitely have their own benefits and challenges.
“I said I prefer external only because it is unlikely one panel can provide all that we need. I'd love to find a proprietary panel that is broad enough with a high enough volume.”
Absolutely. Great, thank you all for your feedback there.
Now we're onto question three. We already learned about recruitment capabilities for panels, but now let's dive into evaluating the actual quality of their sample. There are three considerations here.
First, respondent sourcing and verification. Sample providers should actively verify the identity and eligibility of participants when they join the panel.
Where are these panelists being sourced from? How are they being verified to avoid fraudulent respondents? Do they use digital fingerprinting and device identification techniques to detect duplicate accounts or suspicious activities? Do they monitor respondent engagement and participation patterns to identify and address potential quality issues?
Next, which I briefly brought up earlier, is clean-out rates. Understanding a sample provider's average clean-out rates will help you understand the quality of data entering your survey.
Usually, I like to see about eight to 10% clean-out rates for the partners we work with here at quantilope. If I start regularly seeing clean-outs upwards of 20%, we'll pause, meet with the partner, review what's going on and see if there are any solutions that we or the panel provider can put in place to avoid that.
Now, there are always one-off cases, as someone mentioned, where a bot may come through causing a 50% clean-out rate. That's pretty obvious when it comes up, and we'll talk more about that as well, but it could just be a one-off case that we and the partner can easily resolve for the next project.
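As a rough illustration of that monitoring logic, here is a minimal sketch; the thresholds mirror the rule-of-thumb figures above, and the function and parameter names are hypothetical.

```python
def cleanout_rate(completes: int, removed: int) -> float:
    """Share of collected completes later removed during data cleaning."""
    return removed / completes if completes else 0.0

def review_needed(completes: int, removed: int, threshold: float = 0.20) -> bool:
    """Flag a panel partner for review when clean-outs exceed ~20%,
    well above the 8-10% range described as normal."""
    return cleanout_rate(completes, removed) > threshold

# Example: 400 completes with 90 removed is a 22.5% clean-out rate.
print(review_needed(400, 90))  # True, time to meet with the partner
```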
Next up is quality pre-screening. Many providers use proprietary technology or even third-party service partners, such as Research Defender, to protect against fraud and bots.
These services ensure a high level of data quality by analyzing IP addresses, device fingerprints and other behavioral patterns to block non-human and suspicious respondents before they actually enter the survey.
Another question that we receive a lot with regards to panels is ‘How do we ensure data quality within multiple panels?’
First, we really try to minimize the need for multiple panels. We always strive to sample from a single source, and as someone already said in the chat, ‘we'd love to be able to do that,’ but some cases really do not allow for it.
For example, if you're looking for an underrepresented audience and the incidence rate is very low or panels are unable to target a specific product or brand.
Another example could be for trackers as already mentioned, where the sample size is very high and requires monthly waves or even quarterly waves with very large sample sizes. Specifically, when using multiple partners for tracker studies, you will need to ensure the sample blend remains the same wave over wave. This is incredibly important so that the data does not get skewed in any direction by having different amounts of sample coming from the different providers.
What does this actually mean?
If panel one recruits 60% for the first wave and panel two recruits 40%, we will need to ensure that they are following the same percentage for all future waves to keep that data consistent.
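To make that blend math concrete, here is a minimal sketch of holding the panel mix constant wave over wave; the panel names and wave size are illustrative assumptions.

```python
# Blend locked in at wave one: 60% from panel one, 40% from panel two.
BLEND = {"panel_one": 0.60, "panel_two": 0.40}

def wave_quotas(total_completes: int) -> dict[str, int]:
    """Split a wave's target completes across panels in the fixed ratio."""
    return {panel: round(total_completes * share) for panel, share in BLEND.items()}

# Every wave keeps the same mix, e.g. a 500-complete wave:
print(wave_quotas(500))  # {'panel_one': 300, 'panel_two': 200}
```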
Additionally, and this will come up a few times, it's imperative to use a pre-screening service when utilizing multiple panel sources. It can catch duplicates between the panels and prevent them from entering the survey. Probably one of the first questions that comes up is: how do we ensure respondents aren't participating through both panels?
That's where the pre-screening service comes into play. With these solutions in place, I don't actually have a big concern with using multiple sample partners for a project.
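Here is a minimal sketch of that cross-panel duplicate check, assuming each inbound respondent arrives with a device fingerprint; the fingerprinting itself and the names used are hypothetical stand-ins for what a pre-screening service does.

```python
import hashlib

seen_fingerprints: set[str] = set()  # shared across every panel source on the project

def admit_respondent(device_fingerprint: str) -> bool:
    """Admit a respondent only if their device hasn't been seen via any panel."""
    key = hashlib.sha256(device_fingerprint.encode()).hexdigest()
    if key in seen_fingerprints:
        return False  # duplicate across panels: blocked before entering the survey
    seen_fingerprints.add(key)
    return True

print(admit_respondent("fp-abc123"))  # True, first time this device is seen
print(admit_respondent("fp-abc123"))  # False, same device via a second panel
```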
We've reviewed panel source considerations and sampling quality. Now we're going to dive into the data quality measures specifically within the market research platform.
The next question to ask your potential partner is ‘what are your data cleaning practices?’ You should understand the pre-, mid- and post-survey options. I've already spoken a few times about the pre-survey screening, but now I just wanted to show a holistic approach to data cleaning practices. As a reminder, pre-survey screening is used to stop fraud, bots and duplicates before they actually enter the survey.
Now let's look at what we can do during fielding and after fielding to ensure high quality data. Mid-survey, this really includes attention checks to help filter out unengaged participants and bots.
Some examples are instructional attention checks. These checks require respondents to follow specific instructions and ensure they are reading and understanding the question. An example would be please select strongly agree, or for the next question, please choose option C regardless of your actual preference.
Next is semantic attention checks. These checks assess whether respondents are paying attention to the meaning of the question and providing consistent answers, asking questions about a specific topic and then later on asking a contradictory question on the same topic.
Open-end attention checks. I'd like to say we all agree here, but it is very important to always include an open-ended question within your survey. Whether you're using that for data or not, it really helps weed out those bots and unengaged respondents.
So, it could be as simple as ‘please describe your favorite hobbies in a few sentences,’ ‘what are your thoughts on the current economic situation’ or ‘why do you like to use the specific product you mentioned in the previous question?’
Next is image-based attention checks. These checks use images or visual cues to assess attention and comprehension.
Showing a picture and asking a respondent to identify a specific object or detail within an image. Some of you may have seen ‘click all the dogs’ within a question. Just things like that, really, to make sure that everyone is paying attention.
Lastly, there are red herring questions. These questions introduce irrelevant or nonsensical information to see if respondents are paying attention to the context, for example, asking about a fictional product or brand.
When implementing these attention check questions, please be mindful not to make them so challenging that valid, engaged respondents end up disqualifying.
I've actually run into a few instances in the past where a fake brand sounded very similar to a plausible brand for a specific product, and we lost a lot of genuine respondents who were simply making a mistake in answering that question, which led to a lot of clean-outs.
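As a simple illustration of how instructional attention checks might be scored automatically, here is a minimal sketch; the question IDs and instructed answers are hypothetical.

```python
# Hypothetical instructional checks: question ID -> the answer respondents were told to pick.
ATTENTION_CHECKS = {
    "q12": "Strongly agree",  # "Please select strongly agree"
    "q27": "Option C",        # "Please choose option C regardless of your actual preference"
}

def failed_checks(responses: dict[str, str]) -> list[str]:
    """Return the attention-check questions this respondent answered incorrectly."""
    return [
        qid for qid, instructed in ATTENTION_CHECKS.items()
        if responses.get(qid) != instructed
    ]

respondent = {"q12": "Strongly agree", "q27": "Option A"}
print(failed_checks(respondent))  # ['q27'] -> candidate for removal or review
```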
All right, now we're going to do one last poll for the day: how often do you use attention check questions in your surveys?
Again, let us know in the chat which type of attention check questions you prefer.
Oh, we're getting a good range here. This is interesting.
Where I see it now, we have ‘never’ and ‘all of the time’ tied at 20%, ‘rarely’ at 12%, ‘sometimes’ at 33% and ‘often’ at 15%. So ‘sometimes’ is currently winning here.
Let's see what the chat has to say.
“Open-end questions are crucial, though things like ChatGPT are making review challenging. A long, well thought out response is often copied and pasted.”
Holly, I'll talk about that actually right after this.
“Semantic instructional and image base.”
Yep. Great.
All right, well now I'm eager to get to the next slide to touch on that, Holly.
All right, so post-survey. As we already talked about, automated data cleaning saves you time and hassle. Many times, these are applied once fielding is completed.
However, at quantilope we actually have the ability to apply these data cleaners while the survey is still in field. So, say you have 100 of your 400 respondents so far: we can apply these data filters and remove flagged people automatically.
This obviously speeds up the process and eliminates the scenario where you reach 400 completes, clean out 50 of them and have to go back into field. It has really helped us speed up the process.
When considering a platform, it's important to learn what automated flags are available to help with that data cleaning, versus you having to do it manually. There are some flags we have, and I know a lot of other providers may have, specific to open text, so I'll speak to that.
So first, flags for speeders and slowpokes: respondents who finished the survey at a certain threshold above or below the average completion time. Then straight-liners, which flag respondents who respond the same way to every question.
And then open-end cleaners. We have flags for gibberish responses and copy-and-paste responses, basically identifying instances where participants are pasting text rather than writing their own authentic open ends. There may also be other AI options to surface possible flags for you to double-check manually.
But I still always strongly encourage a manual OE review, mainly for context: ensuring the answers make sense and weeding out any repetitive bots.
Also, at quantilope, we have data cleaning flags for max-diff, implicit and price sensitivity meters. These specialized data cleaning flags help us find inconsistencies in participant responses, making sure your data quality is high.
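To show what two of these automated flags might look like under the hood, here is a minimal sketch of speeder and straight-liner detection using pandas; the thresholds and column names are illustrative assumptions, not quantilope's actual rules.

```python
import pandas as pd

def flag_respondents(df: pd.DataFrame, grid_cols: list[str]) -> pd.DataFrame:
    """Flag speeders/slowpokes by completion time and straight-liners on a rating grid."""
    median = df["duration_sec"].median()
    df["speeder"] = df["duration_sec"] < 0.5 * median   # finished far faster than typical
    df["slowpoke"] = df["duration_sec"] > 3.0 * median  # idled far longer than typical
    # A straight-liner gives the identical answer on every item of the grid.
    df["straight_liner"] = df[grid_cols].nunique(axis=1) == 1
    return df

data = pd.DataFrame({
    "duration_sec": [300, 90, 310, 1200],
    "q1": [4, 3, 5, 2], "q2": [4, 2, 1, 2], "q3": [4, 3, 2, 2],
})
print(flag_respondents(data, ["q1", "q2", "q3"]))
```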
All right, now I'm going to hand it over to Lindsey who's going to review how to optimize your research design to ensure high quality results.
Lindsey Guzman
Thanks, Andrea.
Panel quality is so important, and understanding where your data comes from matters, but there are also things that we can do as researchers to make sure that our data is valid and reliable.
One of the first things that I would recommend is leveling up your research game with something called advanced methodologies. What exactly are advanced methodologies?
Think of them as more sophisticated and nuanced ways to gather insights from your audience. They go beyond simple questions and rating scales to uncover deeper preferences and motivations. Here are a few examples.
Something like a conjoint analysis can help you to understand how people make trade-offs between different product features. Imagine you're designing a new smartphone. A conjoint analysis can tell you whether people care more about screen size, camera quality or battery life.
Another method is called max diff, and this is about figuring out people's most and least favorite items, claims, flavors, etc.
So, let's say you're testing different ice cream flavors. Max diff can help you pinpoint the flavors that are absolute winners and the ones that might need some tweaking.
My personal favorite is an implicit association test. This is a method that helps tap into people's subconscious associations.
For example, an implicit test could reveal whether people unconsciously associate your brand with positive or negative emotions.
Why are these advanced methods so great for data quality?
One, they give you a much deeper understanding of what's driving people's choices. Two, they're less susceptible to biases that can creep into traditional surveys. And three, many of these methods force people to make choices, which reveals their true preferences more accurately.
Of course, don't forget the importance of rigorous statistical analysis. This helps ensure that your findings are reliable and truly reflect what's going on with your audience.
If you're serious about data quality, consider adding some advanced methods to your research toolkit.
The second component of valid and reliable data is data cleaning. Andrea touched on this earlier, so I won't go into detail here, but this is an important component of ensuring that our data is valid and reliable.
Next, we should include open-ended questions in our surveys. As Andrea mentioned, these are the questions that give people the freedom to answer in their own words.
I did see in the chat that some of you are concerned about ChatGPT. Fortunately, with the quantilope platform, we are able to use those flags. So, it is really easy to determine if we are talking to actual humans.
I always highly encourage my clients to utilize open-ended responses wherever possible to make sure that we're speaking to a human and to also validate our quantitative findings. For example, when people say they really like or dislike a new product feature, we want to dig into why that is.
Let's not forget the basics. Good survey design is like building a house. You need a strong foundation to support everything else.
So, let's keep in mind some key principles like clear question wording. If your questions are confusing or ambiguous, how can you expect people to give you accurate answers?
Use simple, straightforward language. Avoid jargon or technical terms that might go over people's heads. Be specific and avoid double-barreled questions, the ones that ask about two things at once.
Most importantly, make sure the questions you're asking are relevant to your research goals.
I work on reviewing a lot of our client surveys and sometimes they have a kitchen sink approach where they're doing this one survey, and they want to ask a bunch of things. Let's be strategic and make sure that we are only asking the questions that are relevant for our particular project.
We should also make sure to use unbiased language. Let's keep any biased language out of survey questions and avoid leading questions that suggest a preferred answer.
Use neutral language and avoid emotionally charged words that might sway people's responses. Be mindful of cultural sensitivities and potential biases related to gender, race or other demographics.
Lastly, let's make sure our surveys have a logical flow. They should move like a conversation from one topic to the next. Group related questions together and use clear headings and transitions. Consider using branching logic to tailor the survey path based on individual responses.
And don't forget to thank people for their time at the end.
Alright, now let's shift gears a bit and talk about another crucial element of data quality: how to make your surveys enjoyable for the people actually taking them. Because let's be honest, no one loves filling out surveys. But here's the thing: if people enjoy your survey, they're more likely to give you thoughtful, accurate answers, and that means better data for you.
How do we make our surveys less like a chore and more like a conversation?
First, we should always design our surveys with time limits in mind. Try to keep your surveys as concise as possible.
Respect people's time. We recommend aiming for 10 minutes or less because no one wants to spend their entire lunch break filling out a survey.
It's also helpful to give people a sense of progress. For example, interject throughout the survey to tell respondents what the next section will be about and how close they are to finishing; this helps keep them motivated.
Another consideration is designing with a mobile first mindset.
Let's face it, most people are taking surveys on their phones these days, so make sure your survey looks amazing and is super easy to use on any device.
Use big, clear buttons and fonts so people can easily tap and navigate on their touch screens and avoid complicated layouts or questions that require a ton of scrolling. This could lead to data quality issues if respondents don't scroll and therefore don't see all available options.
Finally, introduce an element of gamification.
Gamification is about making your survey more interactive and engaging. Use question types like advanced methods, sliders and image selections. These make the survey feel less like a test and more like a game. Or you could add some visuals or even videos to break up the text and keep things interesting.
Don't forget to give people positive feedback and encouragement along the way.
By optimizing your surveys for the people taking them, you'll get higher completion rates, reduce bias and ultimately gather more accurate and reliable data.
Happy respondents equal better insights. So put on your UX designer hat and make your surveys fun and engaging.
Alright, we've covered a ton of ground today, from a deep dive into data quality and panels to data cleaning methods and survey design.
As we wrap up, I want to leave you with this key takeaway.
quantilope is here to be your partner in achieving data quality. We've built our platform and processes with a relentless focus on ensuring your insights are accurate, reliable and actionable.
Here's how quantilope helps you get the most out of your data. We offer built-in data cleaning features.
We have six built-in data cleaning flags that automatically identify any respondent who might be giving you questionable data. We also partner with top-notch panel providers who adhere to the highest industry standards for data quality.
Plus, our fielding experts are here to guide you on best practices and monitor your data closely.
We also offer a pre-survey defense module, which helps ensure an extra level of data quality by protecting your projects against non-human and suspicious respondents before these participants even click into your survey.
We are ISO certified, which means we take data security and privacy incredibly seriously. Your data is safe and sound with us.
As you embark on your next research project, remember that quantilope is here to support you every step of the way. We'll help you gather data you can trust so that you can make decisions with confidence.
Thank you for joining us today on this data quality journey. I think we will now open it up to questions.