Loyalty — Heckman Correction

A widely used econometric method for correcting selection bias from non-random enrollment: it models the decision to join as a separate process from spending behavior.

Heckman Two-Step Correction

The Heckman correction (or “heckit”) models why customers choose to join, then uses that model to correct the spending analysis. Under the model’s assumptions, this separates the causal effect of program membership from the effect of who chooses to join.

1 Selection Equation

A probit model fitted on the entire population (both members and non-members) that estimates how likely each customer is to join the rewards program.

Predicts probability of joining based on observable characteristics
Variables include demographics, initial order type, location
Captures the “selection mechanism” that creates bias
Must include at least one variable that affects joining but does not affect spending directly (the exclusion restriction)

2 Outcome Equation

An Ordinary Least Squares (OLS) model that measures actual spending outcomes, corrected for selection bias.

Models spending as a function of membership and other factors
Includes the Inverse Mills Ratio from Step 1
Separates the effect of program membership from the effect of self-selection
Provides consistent estimates of program lift, provided the selection model is correctly specified
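Written out, the two equations described above take roughly the following form. This is a sketch in assumed notation: D (membership indicator), Y (spend), X (outcome covariates), u and ε (error terms), and the coefficients β, δ and θ are labels introduced here for illustration and do not appear in the original. λ is the Inverse Mills Ratio taken from Step 1.

```latex
\begin{aligned}
\text{Selection (probit):}\quad & D_i = \mathbf{1}\{\, Z_i\gamma + u_i > 0 \,\}, \qquad u_i \sim \mathcal{N}(0, 1) \\
\text{Outcome (OLS):}\quad & Y_i = X_i\beta + \delta D_i + \theta \lambda_i + \varepsilon_i
\end{aligned}
```

In this notation δ is the program lift, and θ is the coefficient that picks up the correlation between the unobserved drivers of joining and of spending.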

3 Inverse Mills Ratio

The Inverse Mills Ratio (λ) is the key innovation — a correction factor that captures the bias from non-random selection.

Calculated from the selection equation’s fitted probit index (Zγ)
Added as a regressor in the outcome equation
Absorbs the correlation between joining tendency and spending
If the coefficient on λ is statistically significant, that is evidence of selection bias

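λ = φ(Zγ) / Φ(Zγ) for members
λ = −φ(Zγ) / (1 − Φ(Zγ)) for non-members, when they are kept in the outcome equation

Where φ = standard normal PDF, Φ = standard normal CDF, Z = selection variables, γ = probit coefficients from Step 1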
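The formula can be evaluated with nothing more than the Python standard library. The sketch below is illustrative only: the function name and the `joined` flag are hypothetical, and the non-member branch is the standard form used when non-members stay in the outcome regression (an assumption about how Step 2 is set up, since the formula above covers members).

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def inverse_mills_ratio(index: float, joined: bool = True) -> float:
    """Return lambda for one customer, given the probit index Z*gamma.

    `joined` is a hypothetical flag: True for members, False for non-members.
    """
    pdf = _STD_NORMAL.pdf(index)
    cdf = _STD_NORMAL.cdf(index)
    if joined:
        return pdf / cdf          # phi(Zg) / Phi(Zg)
    return -pdf / (1.0 - cdf)     # -phi(Zg) / (1 - Phi(Zg))


# A customer whose predicted index is 0 (a 50% chance of joining):
lam_member = inverse_mills_ratio(0.0)
lam_non_member = inverse_mills_ratio(0.0, joined=False)
```

Members who were unlikely to join (a very negative index) get a large λ, which is where the correction does most of its work.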

Heckman Two-Step Process

STEP 1: Probit Model → EXTRACT: Mills Ratio (λ) → STEP 2: OLS + λ → RESULT: True Lift
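The whole pipeline can be sketched end to end on simulated data, where the true lift is known and the correction can be checked against it. Everything here is an assumption made for illustration: the variable names, the coefficients, the simulated exclusion variable `z`, and the hand-rolled probit fit (Fisher scoring in NumPy, standing in for whatever statistics package would be used in practice). Non-members are kept in Step 2 and receive the negative form of λ.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
TRUE_LIFT = 10.0  # assumed causal effect of membership on spend

# Simulated customers (hypothetical data-generating process).
x = rng.normal(size=n)   # observed trait: affects joining and spending
z = rng.normal(size=n)   # exclusion variable: affects joining only
u = rng.normal(size=n)   # unobserved drive to join
e = 5.0 * u + rng.normal(scale=3.0, size=n)  # spend error, correlated with u

member = (0.5 * x + 1.0 * z + u > 0).astype(float)
spend = 50.0 + 4.0 * x + TRUE_LIFT * member + e


def norm_pdf(a):
    return np.exp(-0.5 * a**2) / math.sqrt(2.0 * math.pi)


def norm_cdf(a):
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(a / math.sqrt(2.0)))
    return np.clip(cdf, 1e-12, 1.0 - 1e-12)  # keep ratios finite


def fit_probit(W, d, iters=15):
    """Probit maximum likelihood via Fisher scoring."""
    g = np.zeros(W.shape[1])
    for _ in range(iters):
        idx = W @ g
        p, dens = norm_cdf(idx), norm_pdf(idx)
        score = W.T @ (dens * (d - p) / (p * (1.0 - p)))
        info = (W * (dens**2 / (p * (1.0 - p)))[:, None]).T @ W
        g = g + np.linalg.solve(info, score)
    return g


def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]


# STEP 1: probit on the full population, members and non-members.
W = np.column_stack([np.ones(n), x, z])
gamma = fit_probit(W, member)
idx = W @ gamma

# EXTRACT: Inverse Mills Ratio, with the sign flipped for non-members.
lam = np.where(member == 1.0,
               norm_pdf(idx) / norm_cdf(idx),
               -norm_pdf(idx) / (1.0 - norm_cdf(idx)))

# STEP 2: OLS of spend on membership, with and without lambda.
X_naive = np.column_stack([np.ones(n), x, member])
X_heck = np.column_stack([X_naive, lam])
naive_lift = ols(X_naive, spend)[2]
coefs = ols(X_heck, spend)
heckman_lift, lam_coef = coefs[2], coefs[3]

print(f"naive lift:   {naive_lift:.2f}")
print(f"heckman lift: {heckman_lift:.2f} (true lift {TRUE_LIFT})")
print(f"coefficient on lambda: {lam_coef:.2f}")
```

Because customers with a high unobserved drive to join also spend more in this simulation, the naive regression overstates the lift, while the regression that includes λ should land close to the assumed true value. Note that plain OLS standard errors from Step 2 are not valid as-is, since λ is itself estimated; a real analysis would use corrected or bootstrapped standard errors.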