The Sticky Truth About Data Quality: The Invisible Glue Holding Your Insights Together

by Karine Pepin

Co-Founder at The Research Heads

While data quality has been the topic of much discussion in the market research industry for the past few years, little effort has been made to objectively define the concept. Data quality is a hygiene factor that is often overlooked when present, but becomes noticeably problematic when missing. However, by defining data quality solely according to the absence of outliers, we risk losing sight of what truly makes data beautiful. What if we defined data quality based on what it is, rather than what it is not?

Defining Data Quality Based on What it is Not

Often, we define data quality only by what it is not: we remove Satisficers, Speeders, and Straight-liners. How we define these in-survey checks is subjective, and whether the practice actually improves overall results is questionable.

Picture this: You have just completed a long and arduous research project, and you’re eager to present your findings to your client. However, as you begin to delve into the data, your client starts to notice something troubling: the story doesn’t make sense. You feel your stomach drop as your client raises this concern, asking you to explain what’s going on. You rack your brain for an answer and finally settle on “But…there are no Speeders in our data.” Even as you say it, you realize that this is a poor defense. The absence of Speeders does not make the quality of your data good.

Instead, we ought to focus on defining what qualifies as good data.

The Role of Cohesion in Achieving Data Quality

Let’s take a philosophical step back and consider what makes data beautiful.

At its core, beautiful data makes sense. When we view data quality through this lens, it becomes less subjective than we might think. Data makes sense when the story of each participant is cohesive.

If you’ve seen bad data, you know that participants who cheat in surveys usually answer randomly, and the results are incoherent. Think of Gen Zs buying retirement properties, plumbers performing DNA sequencing, and retirees enrolling in kindergarten classes.

Cohesion doesn’t mean that the findings can’t be surprising; that’s why we do research! But if you were to look at each survey participant in your dataset row by row, you would find that good participants typically remain true to their persona throughout the survey. That’s cohesion.
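As a rough illustration of that row-by-row review, the sketch below checks each participant against a handful of contradiction rules. The field names, the rules, and the sample rows are all hypothetical and invented for this example. A flag is a prompt for a human to review the participant, not an automatic removal, since a surprising answer is not necessarily an incoherent one.

```python
# Hypothetical contradiction rules. Field names and values are illustrative
# assumptions, not taken from any real survey or tool.
RULES = {
    "gen_z_buying_retirement_property": lambda r: (
        r.get("generation") == "Gen Z"
        and r.get("bought_retirement_property") is True
    ),
    "plumber_doing_dna_sequencing": lambda r: (
        r.get("occupation") == "plumber"
        and r.get("performs_dna_sequencing") is True
    ),
}


def cohesion_flags(row, rules=RULES):
    """Return the name of every rule that this participant's answers trip."""
    return [name for name, contradicts in rules.items() if contradicts(row)]


# Two made-up participants: one coherent persona, one that contradicts itself.
participants = [
    {"id": "p1", "generation": "Gen Z", "occupation": "student",
     "bought_retirement_property": False, "performs_dna_sequencing": False},
    {"id": "p2", "generation": "Gen Z", "occupation": "plumber",
     "bought_retirement_property": True, "performs_dna_sequencing": True},
]

# Keep only participants with at least one flag, for manual review.
review_queue = {}
for p in participants:
    flags = cohesion_flags(p)
    if flags:
        review_queue[p["id"]] = flags

print(review_queue)
```

In practice the rules would have to come from the questionnaire itself, which is part of why this kind of check is harder to game than a generic trap question.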

Another hallmark of good data quality is open-ended responses that are relevant to the question at hand. Open-ended responses that are consistent with the rest of the data in terms of themes or patterns further reinforce the cohesiveness of the data. Some might argue that gauging responses this way is also subjective, but the ultimate test is straightforward: Are you comfortable sharing the open-ended responses with your client?

Avoiding Confirmation Bias by Developing Tools to Assess Cohesion

Removing Satisficers, Straight-liners, and Speeders is not enough on its own. When we remove participants based on these rules, we shoehorn the metrics we have into telling us what we want to see instead of determining what we need to know.

To truly achieve good data quality, we need to develop tools that can help us identify a lack of participant-level cohesion. For example, the Root Likelihood (RLH) fit score can improve data quality by identifying participants who may have responded randomly to a choice task, such as a Conjoint exercise. Consistency checks of this kind are not only better indicators of good-quality data, they are also less obvious to participants who may have become skilled at avoiding the more obvious quality assurance traps.
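To make the idea concrete, here is a minimal sketch of a root likelihood calculation. It assumes you already have, for each participant, the probability that a fitted choice model assigned to the option the participant actually picked in each task. Root likelihood is the geometric mean of those probabilities, so a participant choosing at random among k options tends to score near 1/k. The function names, the sample scores, and the 1.2 margin are hypothetical choices for illustration, not an established cutoff.

```python
import math


def root_likelihood(chosen_probs):
    """Geometric mean of the model's probabilities for the chosen options.

    chosen_probs: one probability per choice task, each in (0, 1].
    """
    if not chosen_probs:
        raise ValueError("need at least one choice task")
    log_sum = sum(math.log(p) for p in chosen_probs)
    return math.exp(log_sum / len(chosen_probs))


def flag_low_fit(rlh_by_participant, n_alternatives, margin=1.2):
    """List participants whose fit is close to what random answering gives.

    The chance level is 1 / n_alternatives. `margin` is an illustrative
    assumption; a real study would need to justify its own threshold.
    """
    cutoff = margin / n_alternatives
    return sorted(pid for pid, rlh in rlh_by_participant.items() if rlh < cutoff)


# Made-up probabilities for two participants across four tasks,
# each task offering four alternatives.
scores = {
    "consistent": root_likelihood([0.70, 0.60, 0.80, 0.65]),
    "near_random": root_likelihood([0.25, 0.30, 0.20, 0.25]),
}

print(flag_low_fit(scores, n_alternatives=4))
```

As with any single metric, a low score is best treated as one signal to weigh alongside the participant's other answers rather than as a removal rule on its own.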

Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.
