LevelUP Your Research

February 8, 2022

Why I Love the Beta Distribution (Part One)

The Beta distribution should be in every market researcher’s toolbox.

by Joel Rubinson

President at Rubinson Partners Inc

This article is the first in a series on the Beta distribution.

All marketing researchers are familiar with the normal distribution (and the closely related Student’s t-distribution): that bell-shaped curve on which we base our stat testing and so many other analytics.

But the most important distribution for delivering insights about choice behaviors and predicting ad responsiveness might be something else: the Beta distribution – and you should make good friends with it.

First, some things to know:

  • The Beta distribution is the natural probability distribution of probabilities! It is bounded between 0 and 1 (like probabilities are).
  • The Beta can take on almost any shape. It can look like a bell curve but it can also be U-shaped (depending on its two parameters, alpha and beta).
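To make the shape point concrete, here is a minimal sketch using only the Python standard library. The `shape` helper and the three parameter pairs are illustrative assumptions, not values from my modeling; the helper simply compares the density near the edges with the density in the middle.

```python
import math

def beta_pdf(p, alpha, beta):
    """Density of Beta(alpha, beta) at p, for 0 < p < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p))

def shape(alpha, beta):
    """Rough shape label: compare the density near 0 and 1 with the density at 0.5."""
    low, middle, high = (beta_pdf(p, alpha, beta) for p in (0.01, 0.5, 0.99))
    if low > middle and high > middle:
        return "U-shaped"
    if middle > low and middle > high:
        return "bell-like"
    return "skewed"

# Illustrative parameter pairs (assumptions, not fitted values).
for a, b in [(5, 5), (0.5, 0.5), (0.3, 0.6)]:
    print(f"alpha={a}, beta={b}: {shape(a, b)}")
```

When both parameters are below 1, the density piles up at the two ends, which is the U-shape discussed in the rest of this article.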

 

Distribution of consumer preferences towards a given brand follows a Beta distribution

As it turns out, usually the U-shape is a better description of consumers’ distribution of probabilities of choosing a given brand.

When I tested it against Numerator receipt-scanning data on 45 brands from seven different CPG categories, the correlation to actual data was 99% (not a typo!).

The Beta has natural marketing interpretations too. Alpha divided by (alpha + beta), that is α/(α+β), is the expected share of next purchase events for the brand of interest among all category purchasers. The purchase-to-purchase repeat rate, my favorite measure of brand loyalty, is calculated directly as (α+1)/(α+β+1).

If you have a 10% share brand, you know the ratio of the parameters, but it is the sum of the parameters that defines loyalty towards the brand. So alpha = 10 and beta = 90 would give you a 10 share with basically no brand loyalty (a repeat rate of about 11%), while alpha = 0.1 and beta = 0.9 would give you the same share but with a repeat rate over 50% (relatively high loyalty). For the 45 brands I mentioned modeling, α+β tended to add to around 1.5.
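The arithmetic above is easy to check. This is a small sketch of the two formulas; the function names and the `params_from_share` helper are mine, introduced only for illustration.

```python
def expected_share(alpha, beta):
    """Expected share of next purchases: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)

def repeat_rate(alpha, beta):
    """Purchase-to-purchase repeat rate: (alpha + 1) / (alpha + beta + 1)."""
    return (alpha + 1) / (alpha + beta + 1)

def params_from_share(share, total):
    """Recover (alpha, beta) from a share and the sum alpha + beta (hypothetical helper)."""
    return share * total, (1 - share) * total

# Same 10% share, three different values of alpha + beta.
for total in (100, 1.5, 1):
    a, b = params_from_share(0.10, total)
    print(f"alpha+beta={total}: share={expected_share(a, b):.2f}, "
          f"repeat rate={repeat_rate(a, b):.1%}")
```

The share stays at 10% in every row; only the repeat rate moves as the sum of the parameters shrinks.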

 

Heterogeneity is the right way to think about consumers

When you fit a Beta distribution to consumer purchase data, it will establish a mental model for you that consumers are heterogeneous in their purchase probabilities towards your brand and the great majority of consumers have no interest in your brand (unless you have a share like Tide or Coca-Cola).

This should immediately lead you to question the idea of reach-based marketing, where you avoid targeting and everyone with a mouth is in your universe. In fact, I have proven, in a published white paper, that the Movable Middle (those with a 20-80% probability of choosing your brand) is five times more responsive to your advertising! The size of the Movable Middle and this finding both come directly from applying the Beta distribution.
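Here is one way to size the Movable Middle once you have a fitted Beta: integrate the density between 0.2 and 0.8. This is a sketch under stated assumptions. The parameters (a 10% share with α+β = 1.5) are illustrative, and the Simpson's-rule integration is just a standard-library stand-in for a proper Beta CDF routine.

```python
import math

def beta_pdf(p, alpha, beta):
    """Density of Beta(alpha, beta) at p, for 0 < p < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p))

def prob_between(lo, hi, alpha, beta, steps=2000):
    """P(lo <= p <= hi) under Beta(alpha, beta), via composite Simpson's rule.

    Requires 0 < lo < hi < 1 so the density is finite on the whole interval.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (hi - lo) / steps
    total = beta_pdf(lo, alpha, beta) + beta_pdf(hi, alpha, beta)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * beta_pdf(lo + i * h, alpha, beta)
    return total * h / 3

# Illustrative brand: 10% share, alpha + beta = 1.5 (assumed values).
alpha, beta = 0.15, 1.35
movable_middle = prob_between(0.2, 0.8, alpha, beta)
print(f"Share of category buyers in the Movable Middle: {movable_middle:.1%}")
```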

 

Brand tracking and brand equity research

Great new insights and value will come from your tracker when you use “Beta distribution” thinking. Ask respondents a constant-sum question and use their answers to model their probability of buying each brand in the category. Fit a Beta distribution to each brand and you will see what each respondent is loyal to, what the co-loyalties are, and what the market structure might be (covariance of probabilities towards similar brands). Furthermore, if you want to see what a brand’s perceptual strengths are, look at its attribute ratings among those with a 50%+ probability of choosing the brand compared to other brands’ 50%+ consumers. This will also be your key analysis for brand health. If your 50%+ consumers don’t think as highly of you as those loyal to your competitors think of them, you are in deep trouble for the future.
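As a rough sketch of that workflow: turn each respondent's constant-sum points into probabilities, then fit a Beta per brand. The responses below are made up, and fitting by the method of moments is my simplifying assumption for illustration; it is not necessarily how you would fit production data.

```python
from statistics import mean, pvariance

# Hypothetical constant-sum answers: each respondent splits 10 points across brands.
responses = [
    {"A": 10, "B": 0, "C": 0},
    {"A": 0, "B": 7, "C": 3},
    {"A": 0, "B": 2, "C": 8},
    {"A": 5, "B": 5, "C": 0},
    {"A": 0, "B": 0, "C": 10},
    {"A": 1, "B": 6, "C": 3},
]

def to_probabilities(allocation):
    """Convert one respondent's point allocation into purchase probabilities."""
    total = sum(allocation.values())
    return {brand: points / total for brand, points in allocation.items()}

def fit_beta_moments(probs):
    """Method-of-moments Beta fit: match the sample mean and variance."""
    m, v = mean(probs), pvariance(probs)
    if not 0 < v < m * (1 - m):
        raise ValueError("variance must be between 0 and mean * (1 - mean)")
    common = m * (1 - m) / v - 1  # this is the estimate of alpha + beta
    return m * common, (1 - m) * common

probabilities = [to_probabilities(r) for r in responses]
for brand in ("A", "B", "C"):
    alpha, beta = fit_beta_moments([p[brand] for p in probabilities])
    print(f"Brand {brand}: alpha={alpha:.2f}, beta={beta:.2f}, "
          f"share={alpha / (alpha + beta):.1%}")
```

The same per-respondent probabilities are what you would use to look at co-loyalties across brands.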

 

Relationship of Beta distribution to the Dirichlet distribution

Some of you analytics techies out there might have tried to apply the Dirichlet distribution, which is especially favored by the Ehrenberg-Bass Institute. Why not? Wikipedia tells us the Dirichlet is a multivariate version of the Beta. Actually, it is not the multivariate version you want! The Dirichlet makes certain overly restrictive assumptions, like there is no such thing as market structure (go tell that to a beer company). In my humble opinion, you are much better off using a series of Beta distributions. The covariance of loyalties will come from your tracker using constant sum and be analyzed as I mentioned. And you will learn a lot from this! (If you want a true multivariate Beta model, you need to use marginal Betas for each brand and then link them by something called a copula, which captures correlations across marginal distributions; no restrictive assumptions there!)
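One way to see the restriction: under a Dirichlet, the correlation between any two brands' purchase probabilities has a closed form that is always negative, so two similar brands can never show positive co-loyalty. The sketch below uses that standard formula; the parameter values are illustrative assumptions.

```python
def dirichlet_correlation(alphas, i, j):
    """Correlation between components i and j (i != j) of a Dirichlet(alphas)."""
    total = sum(alphas)
    a_i, a_j = alphas[i], alphas[j]
    return -((a_i * a_j) / ((total - a_i) * (total - a_j))) ** 0.5

# Illustrative three-brand category (assumed parameters).
alphas = [0.3, 0.3, 0.9]
for i in range(len(alphas)):
    for j in range(i + 1, len(alphas)):
        print(f"brands {i} and {j}: correlation = "
              f"{dirichlet_correlation(alphas, i, j):.2f}")
```

Whatever parameters you try, every pair comes out negative, and the correlation is fixed by the brands' sizes alone rather than by how similar the brands are.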

I could do a half-day seminar on this! (Hint, hint 😊)

So, deep insights into consumer preferences and behaviors are one reason to celebrate and expand the use of the Beta. In part two, I’ll give you one or two more important applications.
