Too Good to be True: How AI is Impacting Data Quality

Picture this: you’re reviewing survey data and reading open-ended responses. You’ve just found the epitome of a good open end: a thorough, insightful answer with no spelling errors and no profanity. It’s perfect.

by Mary Draper

Vice President, Network Partners & Quality at EMI Research Solutions

But is it… too perfect? This time last year, this probably wasn’t a question we would be asking ourselves.

We have tracked a steady decline in sample quality conversions since November 2022, due to an increase in suspected fraud activity and rising project-level data quality removals. It’s probably no coincidence that ChatGPT, an artificial intelligence chatbot developed by OpenAI, launched on November 30, 2022. ChatGPT exploded into the mainstream practically overnight, reaching over 1 million users in just 5 days. Since then, we have seen three things increase: suspicious open ends from historically valid panelists using AI to cut corners, nefarious organizations bypassing fraud detection systems, and a resurgence of the haunting ghost completes.

The use of artificial intelligence by fraudsters is one of the greatest examples of how fraud evolves over time, always keeping us on our toes. Quality checks that work now, or worked in the past, are not future-proof. The research industry will continue to improve its tools to block fraud, but on the other end, fraudsters are also becoming more sophisticated in their ability to break into surveys. It will take a concerted and collaborative effort across the industry to work against fraud enabled by AI.

While we do see an increase in possible fraud and questionable quality, the heightened awareness of these activities has also made the everyday data cleaner suspicious of everything: not only the responses lacking insight, but also those with too much of it. Where we used to flag open ends with too few words, we’re now suspicious of too many words, the use of similar words, responses that have the exact same word count, responses with perfect punctuation, or multiple responses with similar misspellings.
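To make a few of those checks concrete, here is a minimal Python sketch of how some of them might be automated. Everything in it is an assumption for illustration: the function names, the flag names, and the thresholds (3, 80, 20, and 3) are hypothetical and would need tuning per study. It covers only word count, shared word count, and duplicate text, and it returns flags for a human reviewer rather than removing anyone.

```python
import re
from collections import Counter

# Hypothetical thresholds, chosen only for illustration.
MIN_WORDS = 3          # fewer words than this: "too_few_words"
MAX_WORDS = 80         # more words than this: "too_many_words"
SHARED_MIN_LEN = 20    # only compare word counts on longer answers
SHARED_MIN_GROUP = 3   # this many answers with the same count is suspicious


def word_count(text: str) -> int:
    """Count word-like tokens (letters, digits, apostrophes)."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical pastes match."""
    return " ".join(text.lower().split())


def flag_open_ends(responses: dict[str, str]) -> dict[str, list[str]]:
    """Map each respondent ID to a list of review flags (possibly empty)."""
    counts = {rid: word_count(text) for rid, text in responses.items()}
    # How many longer answers share each exact word count.
    shared = Counter(n for n in counts.values() if n >= SHARED_MIN_LEN)
    # How many answers share the same normalized text.
    texts = Counter(normalize(text) for text in responses.values())

    flags: dict[str, list[str]] = {}
    for rid, text in responses.items():
        n = counts[rid]
        found = []
        if n < MIN_WORDS:
            found.append("too_few_words")
        if n > MAX_WORDS:
            found.append("too_many_words")
        if n >= SHARED_MIN_LEN and shared[n] >= SHARED_MIN_GROUP:
            found.append("shared_word_count")
        if texts[normalize(text)] > 1:
            found.append("duplicate_text")
        flags[rid] = found
    return flags
```

A flag here is a prompt to look closer, not a verdict. A long, well-written answer will trip "too_many_words" whether it came from a chatbot or from an engaged respondent.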

With more thorough data cleaning measures being implemented (as they should be), there is cause for concern about the percentage of honest human responses being discarded out of an abundance of caution.

A couple of quick tips can help in spotting AI-generated responses. First, include an open-ended question that asks for the respondent’s opinion or feelings. AI currently cannot respond with opinions or feelings on a topic, so its responses are a bit easier to identify because they are fact-driven. Second, we suggest including copy/paste detection in your survey. Blocking the respondent who attempts to paste an answer is not enough.

There is no good reason for a respondent to copy a response or question, either. Copying can therefore be added as a programmatic quality flag as well. When AI use is discovered, it is important to relay the details back to the recruitment source so they can take a deeper dive into the panelist’s behavior.
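As a rough illustration of treating copy and paste as flags rather than simply blocking them, the Python sketch below groups clipboard events by recruitment source so the details can be passed back to each supplier. It assumes the survey platform already records clipboard events and can export them. The `ClipboardEvent` shape, its field names, and `summarize_by_source` are hypothetical, not part of any particular survey tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ClipboardEvent:
    """Hypothetical export row: one recorded event in the survey."""
    respondent_id: str
    source: str       # recruitment source, e.g. a panel name
    question_id: str
    action: str       # "copy", "paste", or some other event type


def summarize_by_source(events: list[ClipboardEvent]) -> dict[str, dict[str, list[str]]]:
    """Return {source: {respondent_id: ["action:question_id", ...]}}.

    Only copy and paste events are kept; other event types are ignored.
    """
    report: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for event in events:
        if event.action in ("copy", "paste"):
            report[event.source][event.respondent_id].add(
                f"{event.action}:{event.question_id}"
            )
    # Convert the nested sets to sorted lists for a stable, readable report.
    return {
        source: {rid: sorted(flags) for rid, flags in by_respondent.items()}
        for source, by_respondent in report.items()
    }
```

Grouping by source keeps the focus on the handoff: each recruitment partner receives only its own panelists and the questions where the behavior occurred.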

Some market research organizations have already started acquiring, building, and implementing AI platforms to assist with their research. We anticipate good things as the industry collaborates to better understand AI technology and continues to strive toward quality. Moral of the story: don’t throw the baby out with the bathwater, but make sure the little guy gets a good scrub before we dress him up and send him out into the world.


Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.
