Data Quality, Privacy, and Ethics

November 1, 2024

The Rising Issue of Bad Data in Online Surveys: Causes and Contributing Factors

Learn how market researchers are enhancing defenses and redesigning surveys to combat the growing threat of online fraud in data collection.

by Sebastian Berger

Head of Science at ReDem

Discover the key forces driving market researchers to adopt advanced defenses and redesign surveys to counter the escalating threat of online fraud.

Around the time the ChatGPT hype began about two years ago, concerns about fraud in online surveys grew within the market research industry. In April 2023, organizations like the Insights Association and ESOMAR, under the coordination of the Market Research Society (MRS), launched the "Global Data Quality Initiative," a global campaign focused on combating threats to survey quality, particularly online survey fraud.

Since then, the number of publications on this topic has steadily increased. It is encouraging to see that more researchers are now tackling online fraud through technological solutions. This article delves into the background of this issue and highlights the importance of designing a well-thought-out questionnaire in this context.

Reasons for the Growing Focus on Online Survey Fraud

While online survey fraud has been around for over 20 years, the question remains: why has it only recently gained significant attention? Although the link to the hype surrounding ChatGPT and artificial intelligence (AI) seems apparent, it is not the sole reason. The heightened focus is largely driven by market researchers observing a surge in poor-quality data within their datasets.

This issue can impact roughly half of the data from online panels and nearly all data from freely accessible online surveys. Our recent quality analyses revealed a doubling of interviews classified as fraudulent over the past year. However, the notion that AI is the primary driver behind this increase is only partly accurate. The rise can be attributed to three key factors, with AI playing a role in just one of them.

1. How the Pandemic Unleashed a New Wave of Survey Fraud

Before the COVID-19 pandemic, professional survey fraud was primarily conducted in centralized settings, similar to call centers, in low-wage countries like China. These fraudulent activities were often detected through participant geolocation and by spotting suspicious behavior patterns, such as repeated fast-clicking (speeding) and uniform responses (straightlining).
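The two behavior patterns named above, speeding and straightlining, lend themselves to simple rule-based checks. The following is a minimal sketch rather than any vendor's actual method: the function names, the 40% speeding ratio, and the five-item minimum for grids are illustrative assumptions, and the sample data is invented.

```python
from statistics import median


def is_speeder(duration_sec, all_durations_sec, ratio=0.4):
    """True if the interview took less than `ratio` of the median duration.

    The 0.4 ratio is an assumed cutoff; real projects tune it per survey.
    """
    return duration_sec < ratio * median(all_durations_sec)


def is_straightliner(grid_answers, min_items=5):
    """True if every item in a rating grid received the same answer.

    Short grids are ignored because identical answers there are plausible.
    """
    return len(grid_answers) >= min_items and len(set(grid_answers)) == 1


# Invented example data: total interview duration (seconds) and one rating grid
# per interview.
durations = [610, 540, 95, 720, 580]
grids = [
    [4, 2, 5, 3, 4],
    [3, 3, 4, 2, 5],
    [3, 3, 3, 3, 3],
    [1, 4, 2, 5, 3],
    [5, 4, 4, 2, 1],
]

flags = [
    {"speeding": is_speeder(d, durations), "straightlining": is_straightliner(g)}
    for d, g in zip(durations, grids)
]

for index, flag in enumerate(flags):
    print(index, flag)
```

In this sample only the third interview is flagged on both criteria. Checks of this kind catch careless fraud, but as the article goes on to describe, more sophisticated fraudsters deliberately avoid these markers.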

However, when the pandemic struck, many workers in these so-called click farms lost their jobs. Facing financial difficulties and limited alternatives, they continued their fraudulent activities from home. To do so, they purchased inexpensive used smartphones to participate in multiple surveys simultaneously. This approach enabled them to earn incomes significantly higher than the average in their countries.

2. Professionalization Through Social Media

Operators of these private phone farms increasingly began sharing their experiences, tips, and tricks on social media. Beyond general advice, these fraudsters disseminate detailed information about specific surveys and their verification checks in online forums and groups.

They teach others how to bypass attention checks and trap questions, enabling a wider network of fraudsters to participate undetected. The potential to earn substantial income with minimal effort through extensive automation led to the spread of professional online survey fraud beyond a few low-wage countries. This resulted in a multinational expansion and a surge in the professionalization of decentralized survey fraud conducted from "home offices."

3. Use of Modern Technologies

Modern technologies like AI, bots, botnets, and VPNs enable private phone farms and traditional click farms to generate numerous fake interviews almost entirely automatically. These fraudulent interviews are challenging to detect because they leave no noticeable digital fingerprints and intentionally avoid typical detection markers such as speeding, straightlining, or nonsensical responses.

Even identity verification via SMS or voice, and in the near future potentially via visual checks, is increasingly automated through online services, further complicating fraud detection and prevention. Because these tools are often available as free or low-cost, user-friendly, web-based software, they have rapidly gained popularity among fraudsters.

Technological Defense Against Online Survey Fraud

The rise in poor-quality data and growing concerns about the increasing sophistication of survey fraud, particularly through AI, have amplified the demand for advanced technological defenses. Traditional methods like eyeball checks and manual data cleaning are no longer sufficient to effectively combat online survey fraud. Instead, quality assurance is increasingly automated through new tools designed to reduce both costs and time. These tools fall into two main categories:

1. Pre-Survey Checks

This category includes technologies designed to prevent fraudulent behavior before survey participation. Tools in this group block multiple participations by analyzing various attributes and parameters, such as IP addresses, cookies, and hardware and software configurations. However, fraudsters often bypass these defenses by using tools like VPNs or proxy servers to mask their true IP addresses, creating the illusion of responses from different locations. They also employ specialized browsers and delete cookies to further evade detection.
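As a rough illustration of how duplicate-participation checks can work, and why a changed IP address defeats them, the sketch below hashes a set of entry attributes into a fingerprint and reports repeat entries. The field names (`ip`, `cookie_id`, `user_agent`, `screen`, `timezone`), the helper functions, and the sample entries are all assumptions made for this example, not the interface of any particular tool.

```python
import hashlib

DEFAULT_FIELDS = ("ip", "cookie_id", "user_agent", "screen", "timezone")


def fingerprint(meta, fields=DEFAULT_FIELDS):
    """Hash the selected attributes of a survey entry into one identifier."""
    raw = "|".join(str(meta.get(field, "")) for field in fields)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def find_duplicates(entries, fields=DEFAULT_FIELDS):
    """Return respondent IDs whose fingerprint was already seen earlier."""
    seen = set()
    duplicates = []
    for entry in entries:
        fp = fingerprint(entry, fields)
        if fp in seen:
            duplicates.append(entry["respondent_id"])
        else:
            seen.add(fp)
    return duplicates


# Invented entries: r2 repeats r1 exactly; r3 is the same device behind a
# different IP address with cookies deleted; r4 is a different device.
device = {"user_agent": "ExampleBrowser/1.0", "screen": "1080x2400",
          "timezone": "Europe/Vienna"}
entries = [
    {"respondent_id": "r1", "ip": "203.0.113.10", "cookie_id": "abc", **device},
    {"respondent_id": "r2", "ip": "203.0.113.10", "cookie_id": "abc", **device},
    {"respondent_id": "r3", "ip": "198.51.100.7", "cookie_id": "", **device},
    {"respondent_id": "r4", "ip": "192.0.2.44", "cookie_id": "xyz",
     "user_agent": "OtherBrowser/2.3", "screen": "1440x900",
     "timezone": "America/Chicago"},
]

print(find_duplicates(entries))
print(find_duplicates(entries, fields=("user_agent", "screen", "timezone")))
```

With all fields included, only the exact repeat `r2` is reported, because `r3` changed its IP address and deleted its cookie. Restricting the fingerprint to device attributes also reports `r3`, but at the cost of false positives, since many legitimate respondents share the same phone model and time zone.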

2. In-Survey Checks

These tools evaluate the quality of the data collected either during or after the survey, filtering out interviews that suggest inattentive or fraudulent behavior. AI algorithms that leverage machine learning are particularly effective because they continuously learn and adapt to new fraud techniques, detecting patterns such as unusual response times, recurring answers, or suspicious similarities between participants.
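One of the patterns mentioned above, suspicious similarity between participants, can be approximated without machine learning by comparing answer vectors pairwise. The sketch below is a simple rule-based stand-in for the adaptive models the article describes; the function names, the 0.85 agreement threshold, and the sample answers are assumptions chosen for illustration.

```python
from itertools import combinations


def agreement(a, b):
    """Share of questions on which two respondents gave the same answer."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def similar_pairs(answers_by_id, threshold=0.85):
    """Return pairs of respondent IDs whose answers agree at or above threshold.

    Only answer lists of equal, non-zero length are compared.
    """
    pairs = []
    for (id_a, a), (id_b, b) in combinations(answers_by_id.items(), 2):
        if a and len(a) == len(b) and agreement(a, b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Invented closed-ended answers to ten questions per respondent.
answers = {
    "r1": [1, 3, 2, 5, 4, 2, 1, 3, 5, 2],
    "r2": [1, 3, 2, 5, 4, 2, 1, 3, 5, 1],
    "r3": [2, 1, 4, 3, 5, 1, 2, 4, 3, 5],
}

print(similar_pairs(answers))
```

Here `r1` and `r2` agree on nine of ten questions and are reported as a pair. Pairwise comparison grows quadratically with sample size, and high agreement is more meaningful on long questionnaires with many answer options than on short ones, so a threshold like this would need calibration per survey.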

However, the effectiveness of these checks is highly dependent on the design of the questionnaire. Traditional survey designs may not fully leverage the capabilities of these modern tools, making them less effective at detecting newer fraud methods. In a follow-up article, we will explore how to craft a questionnaire that unlocks the full potential of technological fraud detection.

Conclusion

As the market research industry continues to grapple with the rising issue of online survey fraud, it is clear that traditional methods are no longer sufficient to ensure data quality. The convergence of the COVID-19 pandemic, the professionalization of fraud through social media, and the use of modern technologies like AI and VPNs has led to a significant increase in fraudulent survey participation. This surge in sophisticated fraud underscores the urgent need for advanced technological defenses that can adapt and evolve alongside these emerging threats.

However, the effectiveness of these defenses is intricately linked to the design of the survey itself. A well-crafted questionnaire is crucial to unlocking the full potential of technological fraud detection. As we move forward, it is vital for researchers to not only adopt these advanced tools but also to rethink how surveys are designed to maximize their effectiveness. In the upcoming article, we will delve deeper into the strategies for designing questionnaires that can fully leverage these modern fraud detection technologies, providing a roadmap for researchers to combat this growing challenge more effectively.

Comments

Jennifer Fredrickson

January 14, 2025

Thank you for this! It has been a huge and growing problem. Look forward to the next article, and hope you will include hidden code and AI tools.

Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.
