Calculate customer lifetime value using three industry-standard approaches.
Customer Lifetime Value (CLTV) Explorer
The CLTV analysis pipeline has four steps: data prep, RFM analysis, clustering, and prediction. The methodology behind each step is explained below.
1. Data Prep
Transaction data is cleaned by removing outliers (the top 1% of spenders) so that they do not skew the model.
Why remove outliers for CLTV? A single whale customer who spends 100× the median will massively skew averages and distort cluster boundaries. Your model ends up optimizing for unicorns instead of the 99% of customers you can actually influence.
Common approaches: Percentile capping (cap at the 99th percentile), IQR method (remove values beyond Q3 + 1.5 × IQR), log transformation (log(value + 1) to compress scale), or Winsorization (replace extremes with nearest non-extreme value).
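The approaches above can be sketched in plain Python. This is a minimal illustration, not the tool's actual code: the function names (percentile, cap_at_percentile, iqr_filter, log_compress) and the sample spend list are assumptions made for the example, and the percentile helper uses simple linear interpolation. Capping at a percentile is shown as a one-sided form of Winsorization.

```python
import math

def percentile(values, q):
    """Percentile with linear interpolation between neighbours, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def cap_at_percentile(values, q=99):
    """Percentile capping: pull anything above the q-th percentile down to it."""
    cap = percentile(values, q)
    return [min(v, cap) for v in values]

def iqr_filter(values, k=1.5):
    """IQR method: drop values beyond Q3 + k * IQR."""
    q1, q3 = percentile(values, 25), percentile(values, 75)
    upper = q3 + k * (q3 - q1)
    return [v for v in values if v <= upper]

def log_compress(values):
    """Log transformation: log(value + 1) compresses the scale."""
    return [math.log1p(v) for v in values]

# Hypothetical per-customer spend with one "whale" at the end.
spend = [20, 25, 30, 35, 40, 45, 50, 55, 60, 5000]
print(iqr_filter(spend))              # the 5000 entry is dropped
print(max(cap_at_percentile(spend)))  # the whale is capped below 5000
```

Which approach fits best depends on whether the extreme customers should stay in the data set (capping, log transformation) or leave it entirely (IQR filtering).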
2. RFM Analysis
Calculate Recency, Frequency, and Monetary value to profile customer behavior.
1. Calculate raw values per customer: Recency (R) = days since last purchase. Frequency (F) = total transactions in a period. Monetary (M) = average spend per transaction.
2. Score each dimension 1–5 using quintiles. For Recency, fewer days since the last purchase is better, so the most recent customers score 5. For F and M, higher raw values score higher.
3. Combine into segments: Concatenate scores (e.g., R=5, F=4, M=3 → “543”) or sum for a composite score (12/15). Typical segments: “Champions” (555), “At Risk” (155), “Lost” (111).
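The three steps above could look roughly like this in plain Python. The function names (rfm_table, quintile_scores, rfm_scores), the transaction tuple layout, and the sample profiles are all assumptions for illustration. Quintiles are assigned by rank, so tied raw values may land in different bins; a production pipeline would need an explicit tie rule.

```python
from collections import defaultdict
from datetime import date, timedelta

def rfm_table(transactions, today):
    """Step 1: raw R, F, M per customer from (customer_id, date, amount) rows."""
    by_customer = defaultdict(list)
    for customer_id, purchase_date, amount in transactions:
        by_customer[customer_id].append((purchase_date, amount))
    table = {}
    for customer_id, rows in by_customer.items():
        last_purchase = max(d for d, _ in rows)
        table[customer_id] = {
            "recency": (today - last_purchase).days,         # days since last purchase
            "frequency": len(rows),                          # transactions in the period
            "monetary": sum(a for _, a in rows) / len(rows)  # average spend per transaction
        }
    return table

def quintile_scores(values, higher_is_better=True):
    """Step 2: map each customer's raw value to a 1-5 score by rank-based quintile."""
    ranked = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ranked)
    return {cid: 1 + (5 * i) // n for i, cid in enumerate(ranked)}

def rfm_scores(table):
    """Step 3: concatenate the scores into a segment code and sum them."""
    r = quintile_scores({c: v["recency"] for c, v in table.items()}, higher_is_better=False)
    f = quintile_scores({c: v["frequency"] for c, v in table.items()})
    m = quintile_scores({c: v["monetary"] for c, v in table.items()})
    return {
        c: {"segment": f"{r[c]}{f[c]}{m[c]}", "composite": r[c] + f[c] + m[c]}
        for c in table
    }

# Hypothetical customers: (days since last order, number of orders, amount per order).
today = date(2024, 1, 31)
profiles = {"A": (1, 5, 100.0), "B": (10, 4, 80.0), "C": (30, 3, 60.0),
            "D": (60, 2, 40.0), "E": (120, 1, 20.0)}
transactions = [
    (cid, today - timedelta(days=days_ago), amount)
    for cid, (days_ago, n_orders, amount) in profiles.items()
    for _ in range(n_orders)
]
scores = rfm_scores(rfm_table(transactions, today))
print(scores["A"])  # {'segment': '555', 'composite': 15}
print(scores["E"])  # {'segment': '111', 'composite': 3}
```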
3. Clustering
Group customers into value segments using K-Means on RFM scores.
K-Means groups data points into K clusters by minimizing the distance between each point and its cluster center (centroid). Algorithm: Choose K (K=3 for Low/Mid/High is a good start), randomly place K centroids, assign each customer to nearest centroid, recalculate centroids as cluster means, repeat until stable.
Tips: Standardize RFM values first (z-score normalization). Use the elbow method or silhouette score to validate K. In production, K=3 to K=5 is usually sufficient.
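A compact sketch of the algorithm and the standardization tip, written in plain Python so that each step stays visible. The names (standardize, squared_distance, kmeans), the sample rfm rows, and the choice to seed the centroids from randomly picked data points are assumptions for this example; in practice a library implementation would normally be used instead.

```python
import random
import statistics

def standardize(rows):
    """Z-score each column so R, F and M contribute on the same scale."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return [tuple((x - m) / s for x, m, s in zip(row, means, stdevs)) for row in rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random start: k of the data points
    labels = None
    for _ in range(max_iter):
        # Assign each customer to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Recalculate each centroid as the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = tuple(statistics.fmean(dim) for dim in zip(*members))
    return labels, centroids

# Hypothetical raw (recency days, frequency, monetary) rows for six customers.
rfm = [(5, 40, 120.0), (7, 35, 110.0), (30, 12, 60.0),
       (28, 15, 55.0), (200, 1, 20.0), (180, 2, 25.0)]
labels, centroids = kmeans(standardize(rfm), k=3)
print(labels)
```

Cluster numbers are arbitrary, so mapping them to Low/Mid/High still requires inspecting each centroid. Results can also depend on the random start, which is one reason to validate K with the elbow method or silhouette score.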
4. Prediction
Random Forest classifier predicts segment membership for new customers.
A Random Forest is an ensemble of decision trees that vote on the outcome. Build N trees (100–500), each on a bootstrapped sample. At each split, only consider a random subset of features. For a new customer, all trees vote; majority wins. Confidence = percentage of trees agreeing.
Why it works for CLTV: It handles non-linear relationships, is insensitive to feature scaling, and gives built-in feature importance rankings.
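To make the mechanics concrete, here is a deliberately simplified toy forest in plain Python. It is a sketch under strong assumptions: every "tree" is a single-split stump rather than a full decision tree, there are only two segments, and the names (fit_stump, fit_forest, predict) and training rows are invented for the example. It still shows the three ingredients described above: bootstrapped samples, a random feature subset per tree, and majority voting with confidence as the share of agreeing trees.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature_ids):
    """One-split tree: pick the (feature, threshold) with the fewest errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            left_vote = Counter(left).most_common(1)[0][0]
            right_vote = Counter(right).most_common(1)[0][0] if right else left_vote
            errors = (sum(lab != left_vote for lab in left)
                      + sum(lab != right_vote for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_vote, right_vote)
    return best[1:]

def fit_forest(rows, labels, n_trees=100, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0])
    subset_size = max(1, round(n_features ** 0.5))  # random feature subset per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap: sample with replacement
        feature_ids = rng.sample(range(n_features), subset_size)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature_ids))
    return forest

def predict(forest, row):
    """All trees vote; majority wins. Confidence = share of trees agreeing."""
    votes = Counter(
        left_vote if row[f] <= threshold else right_vote
        for f, threshold, left_vote, right_vote in forest
    )
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)

# Hypothetical training data: (recency days, frequency, monetary) with known segment.
rows = [(3, 20, 150.0), (5, 18, 140.0), (8, 25, 160.0), (2, 22, 155.0),
        (90, 2, 20.0), (120, 1, 15.0), (100, 3, 25.0), (150, 1, 10.0)]
labels = ["High"] * 4 + ["Low"] * 4
forest = fit_forest(rows, labels)
print(predict(forest, (1, 30, 200.0)))  # a very recent, frequent, high-spend customer
```

A real Random Forest grows deeper trees and handles more than two classes, but the prediction step works the same way.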
CLTV Formula Calculator
Three industry-standard approaches to calculating customer lifetime value. Pick the one that fits your business model.
CLTV = Average Order Value × Purchase Frequency × Customer Lifespan
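As a worked example of this formula, the small function below multiplies the three inputs. The function name and the sample figures are hypothetical, and the sketch assumes purchase frequency and customer lifespan use the same time unit (here, orders per year and years).

```python
def simple_cltv(average_order_value, purchase_frequency, customer_lifespan):
    """CLTV = Average Order Value x Purchase Frequency x Customer Lifespan.

    Assumes purchase_frequency (e.g. orders per year) and customer_lifespan
    (e.g. years) share the same time unit.
    """
    return average_order_value * purchase_frequency * customer_lifespan

# Hypothetical customer: 50.00 per order, 4 orders a year, retained for 3 years.
print(simple_cltv(50.0, 4, 3))  # 600.0
```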
Score each dimension 1–5 based on your customer data. The tool calculates a composite RFM score and segments into Low/Mid/High value with transparent thresholds.
RFM Score = R + F + M
Thresholds: 3–7 = Low | 8–11 = Mid | 12–15 = High
Recency (R): 5 = purchased very recently, 1 = long ago
Frequency (F): 5 = purchases very often, 1 = rarely
Monetary (M): 5 = highest spender, 1 = lowest
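The composite score and thresholds above translate directly into a few lines of Python. The function name rfm_segment is an assumption for this sketch; the thresholds are the ones stated above.

```python
def rfm_segment(recency, frequency, monetary):
    """Return (composite score, tier) for three 1-5 scores.

    Thresholds: 3-7 = Low, 8-11 = Mid, 12-15 = High.
    """
    for name, score in (("recency", recency), ("frequency", frequency), ("monetary", monetary)):
        if score not in range(1, 6):
            raise ValueError(f"{name} score must be an integer from 1 to 5, got {score!r}")
    total = recency + frequency + monetary
    if total <= 7:
        tier = "Low"
    elif total <= 11:
        tier = "Mid"
    else:
        tier = "High"
    return total, tier

print(rfm_segment(5, 4, 3))  # (12, 'High')
print(rfm_segment(1, 1, 1))  # (3, 'Low')
```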
CLTV = (ARPU × Gross Margin %) ÷ Churn Rate %
ARPU: Average Revenue Per User per month
Churn Rate %: Percentage of customers lost per month
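A short sketch of this formula follows. The function name and sample inputs are hypothetical; percentages are passed as whole numbers (70 for 70%). Dividing by the monthly churn rate is equivalent to multiplying by the expected customer lifetime in months, since a constant churn of 5% per month implies an average lifetime of 1 / 0.05 = 20 months.

```python
def margin_churn_cltv(arpu, gross_margin_pct, churn_rate_pct):
    """CLTV = (ARPU x Gross Margin %) / Churn Rate %.

    arpu: average revenue per user per month.
    gross_margin_pct, churn_rate_pct: percentages as whole numbers, e.g. 70 and 5.
    """
    if churn_rate_pct <= 0:
        raise ValueError("churn rate must be positive; zero churn gives an unbounded CLTV")
    return arpu * (gross_margin_pct / 100) / (churn_rate_pct / 100)

# Hypothetical subscription: 50.00 ARPU per month, 70% gross margin, 5% monthly churn.
print(round(margin_churn_cltv(50.0, 70, 5), 2))  # 700.0
```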