Where are you from?
Editor’s note: Denise Brien is senior research manager, Melanie Courtright is vice president, and Marjette Stark is senior vice president at Lewisville, Texas-based DMS Research.
Given the distinctive nature of river samples as opposed to better-known panel samples, it is not surprising that there is continuing uncertainty about this resource among researchers and sample buyers. To address questions regarding river’s use as a valid sample source, and to add to the body of knowledge on online data quality, DMS Research executed a comprehensive study evaluating river respondents and ongoing panelists. The resulting research sheds light on the differences and similarities between the groups and clarifies how these might influence research design and the conclusions resulting from the use of a particular online sample.
The original intention of this research was to better understand the similarities and differences between river sample and panelists. However, in order to put these online sampling methodologies into a broader quality context, CATI respondents were also surveyed. The research utilized the following sample sources:
- DMS river sample (using DMS’s proprietary Opinion Place River Sample);
- panel sample (using DMS’s proprietary SurveySpree panelists and two external panel samples; all three panel samples were analyzed in aggregate for this study);
- CATI interviews (RDD).
Interviews were completed during December 2007 and the analysis conducted during January 2008. The survey received 2,412 responses; at least 400 responses were collected in each sample cell and statistical significance was tested at 95 percent confidence. Quotas were set to control for gender, age, income and ethnicity, and were used to ensure that each cell reflected the overall U.S. population (according to U.S. Census estimates).
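To illustrate the significance testing, here is a minimal sketch of a standard two-proportion z-test at 95 percent confidence; the counts are reconstructed from the reported cell size (400 per cell) and two percentages cited later in this article, so they are illustrative rather than exact study data.

```python
from math import sqrt, erfc

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                        # pooled proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))  # standard error of the difference
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))                      # two-sided p-value via the normal CDF
    return z, p_value

# Illustrative cells: 45 percent of 400 river respondents vs. 81 percent of
# 400 panelists reporting online-survey participation in the past 12 months.
z, p = two_proportion_z_test(180, 400, 324, 400)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 95 percent: {p < 0.05}")
```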
Precautions were taken at every step to ensure that the research was as unbiased, clear and objective as possible. In addition to surveying two different external panel samples, the sources of which were unknown to us, we also outsourced the data collection, tabulation and statistical analysis to further ensure objectivity of results.
An overview of the sampling methods used in this research
To ensure that the study results are uniformly understood, the three sampling methods used in this research are outlined briefly here.
- DMS River Sample is an online sampling method sourced entirely from online promotions (e.g., banners, pop-ups, hyperlinks). This method of random recruitment drives potential respondents to an online portal where they are screened for studies in real time, and qualified respondents are then randomly assigned to a survey (see the sketch after this list). Due to differences in the sampling and recruiting methods of other river samples, the research results presented herein refer only to DMS River Sample.
- Online panels are generally recruited from affiliate and partner sites, through which members are registered and encouraged to become long-term survey participants. Once respondents agree to participate, the panel company sends e-mails inviting them to a particular survey, and interested respondents who qualify then complete the survey.
- CATI (computer-assisted telephone interviewing) refers to the traditional random-digit dial approach, in which a telephone interviewer calls random telephone numbers, screens potential respondents in real time, and asks interested, available and qualified respondents to answer questions while the interviewer enters their responses into a computer program.
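As a purely illustrative sketch of that screen-and-assign flow - the survey inventory, field names and qualification rules below are hypothetical, not DMS’s actual system - real-time river assignment might look like this:

```python
import random

# Hypothetical survey inventory: each survey defines simple qualification rules.
SURVEYS = [
    {"id": "S1", "min_age": 18, "needs_cell_phone": True},
    {"id": "S2", "min_age": 21, "needs_cell_phone": False},
]

def assign_survey(respondent):
    """Screen a respondent in real time and randomly assign one qualifying survey."""
    qualified = [
        s for s in SURVEYS
        if respondent["age"] >= s["min_age"]
        and (respondent["has_cell_phone"] or not s["needs_cell_phone"])
    ]
    return random.choice(qualified)["id"] if qualified else None

print(assign_survey({"age": 25, "has_cell_phone": True}))  # "S1" or "S2", chosen at random
```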
The respondent profile
Demographic and other differences
Demographically, differences among sample populations were limited, in large part due to quotas that controlled basic demographic categories (gender, age, income and ethnicity). Minor differences were observed in marital status (river sample represented a higher proportion of single respondents, while CATI respondents had a correspondingly higher proportion of married respondents) and occupation (both river sample and panelists showed a slightly higher tendency to list “student” as their occupation than CATI respondents).
Beyond demographic comparisons, slight variations among the sample sources appeared in several categories measured, including media consumption, political affiliation, voting history, passport ownership and frequent-flier program membership. However, while differences among these sources existed in these broad areas, there did not seem to be any larger trends suggesting a radically different respondent profile for those recruited online vs. those recruited over the phone, or those who belong to a panel vs. those who do not - at least, not along the dimensions examined above (larger differences in other dimensions are examined in the next section).
Despite the wide range of measures reviewed here and the relatively minor deviations among samples, it is not certain whether measuring additional areas would uncover larger trends. In any case, differences such as those seen in this research are most likely to impact research results only if a survey’s topic coincides with a sample source that carries a corresponding bias (for example, a survey about frequent-flier reward programs might be affected by the selection of panel respondents, or a political poll might be skewed if conducted only over the phone). In such cases, the research can be designed with the potential bias in mind and controlled through screening and other measures to ensure the desired representation.
Attitudes
On a wide range of attitudinal statements measured, there was an overall pattern of more similarities than differences among the three sample sources surveyed. Respondents across the board had similar and predictable attitudes toward risk-taking, and there were some similarities in attitudes about finances and shopping. River sample and panelists differed from CATI respondents on technology issues as they relate to early adoption - this trend will be detailed more fully in the next section.
The presence of an interviewer may have skewed CATI responses toward socially acceptable answers on key attitudinal questions: CATI respondents were more likely than the online (and thus self-interviewed) sample sources to say that they were happy with their standard of living, to agree that how they spend their time is more important than how much money they make, and to indicate that buying American is important.
Technology and online use
A definite trend emerged when we examined the use of technology and online behavior among sample sources (Table 1). Perhaps not surprisingly, online respondents were heavier Internet users across the board. Whether in tenure (years since first going online), frequency (hours online weekly) or activity (such as making purchases online or downloading music), online respondents diverged from their CATI counterparts. This finding held true even when the comparison was limited to CATI respondents who use the Internet: they engage in the same range of online activities, but in many cases at only half the rate of the online sample sources. River sample and panelists, by contrast, were in almost exact alignment in their use of the Internet across a range of activities.
When comparing other technologies - even emerging technologies such as HDTV or iPhones - all three sample sources had more in common than not (Table 2). Rates of HDTV ownership were in line with the overall U.S. population; DVR ownership and home cable/satellite service were similar across samples.
The three respondent sources showed relatively similar ownership of cell phone devices such as smartphones and iPhones. With overall cell phone penetration in the U.S. estimated at 82 percent, CATI respondents were more representative of the overall U.S. population on this measure, while river sample and panelists were alike in their greater likelihood to have a cell phone.
Respondent survey history
The area of greatest difference among river sample, panelists and CATI respondents emerged in their overall pattern of survey-taking. While the vast majority of respondents in all three groups had participated in some form of market research, panelists by far had the greatest overall lifetime participation, with CATI at the opposite end of the spectrum and river representing the middle ground: 42 percent of river respondents had never participated in an online survey prior to this one, suggesting that river sampling does reach people who do not typically participate in online surveys.
The results were similar when limited to a more recent time frame: Among online respondents, the rate of river respondents’ participation in online surveys over the last 12 months was almost half that of panelists (45 percent and 81 percent, respectively). Among online respondents, river sample had the lowest percentage of respondents participating in online surveys on a daily (9 percent) and weekly (39 percent) basis. By contrast, almost nine in 10 panelists indicated at least weekly participation in online surveys, with an average of nearly 17 online surveys per month. CATI respondents had the lowest participation in online surveys overall (daily 1 percent; weekly 20 percent).
Survey behavior
Survey and panel participation
Participation in panels was another area where panelists and CATI respondents occupied the extremes of the spectrum and river sample occupied the middle ground. Roughly 80 percent of the panel respondents indicated that they currently belong, or at one time belonged, to an online research panel, compared to only 10 percent of CATI respondents and around 30 percent of river respondents. (Panelists who participated on a more frequent basis were more likely to call themselves panelists than those who took fewer surveys per month.) Thus, while some river respondents are also panelists (suggesting participation in surveys on a regular basis), the vast majority (70 percent) do not participate in surveys on a regular, invited basis.
Response quality
This research included a range of data quality measures to evaluate how respondents behave once they agree to take a survey and to assess the ultimate quality of their responses. On some measures all three groups performed equally well, while on others all three exhibited issues (Table 3). Several of these checks are sketched in code following the list.
- Low-incidence questions revealed equally high response quality across all three sample sources in terms of proximity to benchmarks.
- Open-ended questions yielded a somewhat surprising result, as CATI had the highest percentage of non-substantive responses (“nothing,” “I don’t know” and the like) to the question “What made you decide to take this survey today?”
- To measure item inconsistency, respondents were asked two opposing questions regarding their attitudes toward brand and price. Respondents who agreed with both statements, as well as those who disagreed with both, were considered “inconsistent” responders. There was a fairly large number of inconsistent responders in the total base sample, and no differences on this measure across the three groups.
- Speeders are known to cause consistency and quality issues. River sample had the fewest speeders in this research. However, it is also possible that experienced respondents are simply more familiar with surveying techniques, scales and instructions, and are therefore able to complete surveys more quickly.
- Trick questions aim to control for inattentive respondents and foster data quality. River sample did not perform as well as panelists on this measure, raising two questions: 1) Are more-experienced respondents on the lookout for control mechanisms that might block them from future participation in surveys? and 2) Are trick questions confusing to less-experienced respondents who have less exposure to survey instructions, question wording and scaled responses?
- Use of scale: Respondents used the scale more uniformly on a five-point satisfaction question with only three rating attributes. On a seven-point agreement scale with multiple attitudinal statements, only 2 percent of river sample and 3 percent of panelists provided the same response to every statement, a very low rate of straightlining.
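The following minimal sketch shows how several of these checks might be computed from a respondent-level file. The field names, the speeder cutoff and the specific brand/price item pair are assumptions for illustration, not the actual instruments or thresholds used in this study.

```python
from statistics import median

def quality_flags(resp, median_seconds):
    """Return data quality flags for one respondent record.

    Assumed fields: completion time in seconds, answers to two opposing
    brand-vs.-price statements (1-7 agreement), a trap-question answer,
    and the ratings from a multi-statement agreement battery.
    """
    flags = {}

    # Speeder: finished in under half the median completion time
    # (the one-half cutoff is a common rule of thumb, not from the study).
    flags["speeder"] = resp["seconds"] < 0.5 * median_seconds

    # Item inconsistency: agreeing (6-7) or disagreeing (1-2) with BOTH
    # of two opposing statements.
    a, b = resp["brand_over_price"], resp["price_over_brand"]
    flags["inconsistent"] = (a >= 6 and b >= 6) or (a <= 2 and b <= 2)

    # Trap question: e.g., an instruction to select a specific answer.
    flags["failed_trap"] = resp["trap_answer"] != resp["trap_expected"]

    # Straightlining: an identical response to every statement in the battery.
    flags["straightliner"] = len(set(resp["agreement_battery"])) == 1

    return flags

# Hypothetical respondent records
respondents = [
    {"seconds": 210, "brand_over_price": 6, "price_over_brand": 6,
     "trap_answer": 7, "trap_expected": 7, "agreement_battery": [4, 4, 4, 4, 4]},
    {"seconds": 80, "brand_over_price": 6, "price_over_brand": 2,
     "trap_answer": 3, "trap_expected": 7, "agreement_battery": [5, 2, 6, 4, 3]},
]
med = median(r["seconds"] for r in respondents)
for r in respondents:
    print(quality_flags(r, med))
```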
Regarding these measures, this research made no attempt to ameliorate known survey design issues that may have caused quality problems. In reviewing these results, it is therefore important to remember that attention to quality in the upfront design and in the selection of respondents can greatly improve the chances of the final product yielding the highest-quality results.
Comparison to benchmarks
Respondents were compared to third-party data on a range of non-demographic benchmarks to further examine how these sample sources differ - and how closely each approximates the U.S. population. Benchmarks were culled from census and other government sources and trade group data, or from large-scale nationwide surveys when independent sources were not available.
On non-technology benchmarks (Table 4), the three sample sources exhibited more similarities than differences - to each other and to the benchmarks. Overall, river sample had the least average variance.
On technology-related benchmarks (Table 5), there was some divergence between the general population and survey respondents, with online respondents in particular showing a tendency toward early adoption and high rates of device ownership.
While the three sources again shared similarities on these measures, CATI was closest to the overall benchmark average, with generally lower rates of technology ownership.
The average absolute error is calculated as the mean of the absolute differences between each sample source’s results and the corresponding benchmarks.
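As a minimal sketch of that calculation - the measure names and most of the values below are placeholders rather than figures from Tables 4 and 5 (only the 82 percent cell phone benchmark is the estimate cited earlier) - it might look like this:

```python
def average_absolute_error(sample, benchmarks):
    """Mean of the absolute percentage-point gaps between one sample
    source's results and the external benchmarks, across all measures."""
    gaps = [abs(sample[m] - benchmarks[m]) for m in benchmarks]
    return sum(gaps) / len(gaps)

# Hypothetical comparison (values in percent).
benchmarks = {"passport": 30.0, "smoker": 21.0, "cell_phone": 82.0}
river = {"passport": 28.0, "smoker": 23.0, "cell_phone": 88.0}
print(f"River average absolute error: {average_absolute_error(river, benchmarks):.1f} points")
```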
Data quality considerations
In summary, there are areas of differences and similarities for each sample source, as well as corresponding data quality considerations.
The sample sources varied from each other in several key areas (online activities, technology adoption and lifestyle attitudes, as well as offline activities like travel and smoking), while in other areas (such as entertainment/media device use and attitudes toward shopping and risk) the three were similar.
Data quality differences were mixed. Sample groups performed well and did not exhibit differences on measures like low-incidence benchmarks or straightlining, though issues appeared across the board on response inconsistency and data traps. River sample exhibited the lowest tendency to speed through the survey.
The largest and most important difference among the three sample sources appeared in their overall pattern of survey-taking participation. Panelists take more surveys and participate more frequently than the other respondent groups, while CATI respondents take far fewer online surveys and participate much more rarely. River sample occupied a middle ground in terms of survey frequency - participating on occasion but not with significant frequency. Participants from all three sources are engaged in all types of research, with surprisingly few reporting no other research participation over the past 12 months.
When designing a sampling frame, researchers should know the implications of each methodology and carefully consider how to mitigate risk. Thus, based on this research, we believe that panel is a good choice for low-incidence projects; CATI is best for research that might be skewed by online/technology adoption and behaviors; and river is the best option for reaching a random, less-surveyed online audience.