All Text Analysis is Subjective

How to address inconsistencies in text analytics.

Editor’s Note: This post is part of our Big Ideas series, a column highlighting the innovative thinking and thought leadership at IIeX events around the world. 


Let’s face it: no captured text, be it from a survey form or from social media, can be analyzed with 100% objectivity. Still, analyzing text quantitatively is clearly useful, and market researchers have long used text as input because of its versatility and breadth. But we cannot pretend that any text analysis is free of ambiguity.

The reasons for this uncertainty are:

  • The text itself doesn’t contain the full information or context, or
  • The person or AI tool analyzing the text is biased or inconsistent

The Source of Issues

Often, these issues are interconnected and occur together: the lack of context in short texts makes biases in the analysis more apparent. For example, one could understand the statement “Good service” in a telecommunications context as “Good customer service” or as “Good network service”. A system or a person that always assigned “Good customer service” would be consistent but highly biased. This shifts the analysis results in a specific direction and, in turn, leads the research buyer to think that customer service is more important than network service. Recently, AI-based automated systems have emerged that are, at least in principle, able to analyze text more consistently, as they don’t get tired or distracted.

When evaluating the accuracy of such automated systems, market researchers often compare them against manual coding, which is the current gold standard in text analysis. However, they tend to forget that manual coding is also biased and inconsistent, especially when coders need to keep track of hundreds of codes, some of which are notoriously difficult or even impossible to distinguish. We compared the results from different professional coders who used the exact same codebook on the exact same data and found surprisingly low agreement across a variety of studies.
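
Agreement between two coders can be put into a single number. The sketch below computes Cohen's kappa, a common chance-corrected agreement measure. This post does not say which measure was used in the comparison, so the choice of kappa, the function name, and the six example codes are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders on the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)

    # Observed agreement: share of items where both coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Expected agreement if each coder assigned codes independently,
    # following only their own overall code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to six telecom comments.
coder_a = ["customer", "network", "customer", "price", "network", "customer"]
coder_b = ["customer", "customer", "customer", "price", "network", "network"]

kappa = cohens_kappa(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this made-up example the coders agree on four of six comments, yet kappa is clearly lower than that raw share, because some of the agreement would be expected by chance alone.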

Keeping it Up to Code

In our experience, which is admittedly anecdotal, a good and concise codebook greatly improves consistency. Bias, on the other hand, can be reduced in an intuitive way: let many different coders work through the same data and then average the results. However, this is very tedious and prohibitively expensive. I would argue that a better, much faster, and cheaper option is to use an AI system that has learned from as many different manual coders as possible. AI systems are well known to be biased, especially when trained on a single data source [1]. But by learning from a diverse set of coders with different biases, the AI can learn to act as an “average coder”, resulting in an analysis with less bias than a full analysis by a single coder.
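
The averaging idea can be illustrated with a small sketch. It covers only the manual step of averaging code shares across coders, not the training of an AI system. The coder names, the codes, and the four example responses are hypothetical, and coder_1 is deliberately built to always choose “customer service”.

```python
from collections import Counter


def code_shares(labels):
    """Share of responses that one coder assigned to each code."""
    counts = Counter(labels)
    return {code: count / len(labels) for code, count in counts.items()}


def averaged_shares(labels_by_coder):
    """Average the per-code shares over coders who coded the same responses."""
    per_coder = [code_shares(labels) for labels in labels_by_coder.values()]
    codes = sorted({code for shares in per_coder for code in shares})
    return {
        code: sum(shares.get(code, 0.0) for shares in per_coder) / len(per_coder)
        for code in codes
    }


# Hypothetical data: three coders label the same four "Good service" comments.
labels_by_coder = {
    "coder_1": ["customer service"] * 4,  # consistent, but biased
    "coder_2": ["customer service", "network service",
                "network service", "customer service"],
    "coder_3": ["network service", "network service",
                "customer service", "network service"],
}

single = code_shares(labels_by_coder["coder_1"])
average = averaged_shares(labels_by_coder)

print("coder_1 alone:", single)
print("averaged:     ", average)
```

Taken alone, coder_1 reports that every comment is about customer service. Averaged with the other two coders, the share drops to a little under 60%, which shows how a single coder's bias is diluted.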

Join our talk at IIeX North America to find out how we compared human coders and different AI-based systems for a large-scale study in Latin America and discuss novel ways to improve quantitative text analysis.

References

1. https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai

Pascal De

Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.
