Probabilistic Models for Computing Customer-Centric Metrics
Introduction
In this post, I will summarize the common statistical methods used to describe and predict customers’ purchase behavior in non-contractual settings. These methods fit probabilistic models to historical transaction records in order to compute customer-centric metrics of managerial interest.
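To make "fitting models to historical transaction records" a bit more concrete, here is a minimal base R sketch of the per-customer summary that buy-till-you-die models typically consume: the number of repeat transactions (`x`), the time of the last transaction (`t.x`) and the total observed time (`T.cal`). The toy log, the column names `cust` and `date`, the calibration end date and the helper `summarise_customer` are all made up for illustration.

```r
# Toy transaction log (hypothetical data): one row per purchase.
elog <- data.frame(
  cust = c("A", "A", "A", "B", "C", "C"),
  date = as.Date(c("2023-01-01", "2023-01-15", "2023-03-01",
                   "2023-01-10", "2023-02-01", "2023-02-20"))
)
cal_end <- as.Date("2023-03-31")  # assumed end of the calibration period

# Summarise one customer's purchase dates, measured in weeks
# since that customer's first purchase.
summarise_customer <- function(d) {
  first <- min(d)
  data.frame(
    x     = length(unique(d)) - 1L,                       # repeat transactions
    t.x   = as.numeric(max(d) - first, units = "weeks"),  # time of last purchase
    T.cal = as.numeric(cal_end - first, units = "weeks")  # total observed time
  )
}

# One row per customer; row names carry the customer id.
cbs <- do.call(rbind, lapply(split(elog$date, elog$cust), summarise_customer))
print(cbs)
```

Customer B has a single purchase, so `x` and `t.x` are both zero. Such customers are the "zero-repeaters" that matter when comparing the models below.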
Models
Model | Author | Description | R package |
---|---|---|---|
NBD | Ehrenberg (1959) | Basic benchmark. Assumes a heterogeneous purchase process, but does not account for the possibility of customer defection. | |
Pareto/NBD | Schmittlein, Morrison, and Colombo (1987) | Combines the NBD model for transactions of active customers with a heterogeneous dropout process. To this date it can still be considered a gold standard for buy-till-you-die models. | |
BG/NBD | Fader, Hardie, and Lee (2005) | Adjusts the Pareto/NBD assumptions about the dropout process in order to speed up computation. A notable drawback is that BG/NBD assumes every customer without a repeat transaction has not defected yet, regardless of how long they have been inactive. | |
MBG/NBD | Batislam, Denizel, and Filiztekin (2007), Hoppe and Wagner (2007) | Improves BG/NBD by allowing customers without any repeat transaction to have already defected. | |
BG/CNBD-k | Reutterer, Platzer, and Schröder (2020) | Extends BG/NBD by allowing for regularity in the transaction timings. If such regularity is present (even in a mild form), these models can yield significant improvements in customer-level forecasting accuracy, while the computational costs remain at a similar order of magnitude. | |
MBG/CNBD-k | Reutterer, Platzer, and Schröder (2020) | Extends MBG/NBD by allowing for regularity in the transaction timings. | |
Pareto/NBD (HB) | Ma and Liu (2007) | Keeps the original Pareto/NBD assumptions, but estimates the model with an MCMC approach rather than MLE. | |
Pareto/NBD (Abe) | Abe (2009) | Relaxes the assumed independence of the purchase and dropout processes, and can incorporate customer covariates. | |
Pareto/GGG | Platzer and Reutterer (2016) | Allows for a varying degree of regularity in the transaction timings. | |
REMARK:
In practice, fitting the Pareto/NBD model can take too much computation time when the transaction data set is large. That is where BG/NBD comes in: with limited computing resources, it is a good second choice because it is fast to compute. The biggest issue with BG/NBD, however, is that zero-repeaters are always treated as alive. MBG/NBD resolves this issue while keeping the fast computation, which makes it a very useful model for large data sets.
# check mbgnbd
?zetaclv::mbgnbd_predict
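The zero-repeater issue from the remark can be illustrated with the closed-form P(alive) expressions of the two models, as I understand them from Fader, Hardie, and Lee (2005) and Batislam, Denizel, and Filiztekin (2007). This is only a base R sketch: the function names `palive_bgnbd` and `palive_mbgnbd` are my own, and the parameter values are illustrative, not estimated from any data.

```r
# P(alive) at the end of the observation period.
# x: repeat transactions, t.x: time of last transaction, T.cal: observed time.
# r, alpha: purchase-rate heterogeneity; a, b: dropout heterogeneity.
palive_bgnbd <- function(x, t.x, T.cal, r, alpha, a, b) {
  ratio <- ((alpha + T.cal) / (alpha + t.x))^(r + x)
  p <- 1 / (1 + (a / (b + x - 1)) * ratio)
  p[x == 0] <- 1  # under BG/NBD, zero-repeaters cannot have dropped out
  p
}

palive_mbgnbd <- function(x, t.x, T.cal, r, alpha, a, b) {
  ratio <- ((alpha + T.cal) / (alpha + t.x))^(r + x)
  1 / (1 + (a / (b + x)) * ratio)  # dropout is possible right after the first purchase
}

# Three zero-repeaters observed for 4, 26 and 52 weeks (illustrative values).
T.cal <- c(4, 26, 52)
zero  <- rep(0, length(T.cal))
p_bg  <- palive_bgnbd(zero, zero, T.cal, r = 0.25, alpha = 4, a = 0.8, b = 2.5)
p_mbg <- palive_mbgnbd(zero, zero, T.cal, r = 0.25, alpha = 4, a = 0.8, b = 2.5)
print(data.frame(T.cal, p_bg, p_mbg))
```

Under BG/NBD the probability stays at 1 no matter how long the customer has been silent, whereas under MBG/NBD it falls as the period of inactivity grows.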
Time-invariant/time-varying models
The CLVTools R package 📦 offers more advanced models:
Pareto/NBD model with time-invariant contextual factors (Fader & Hardie 2007)
Pareto/NBD model with time-varying contextual factors (Bachmann, Meierer & Näf 2021)
Standard BG/NBD model (Fader, Hardie, & Lee 2005)
BG/NBD model with time-invariant contextual factors (Fader & Hardie 2007)
Standard Gamma/Gompertz/NBD (Bemmaor & Glady 2012)
Gamma/Gompertz/NBD model with time-invariant contextual factors (Näf, Bachmann & Meierer 2020)
Gamma/Gamma model to estimate customer spending (Colombo & Jiang 1999; Fader, Hardie & Lee 2005; Fader & Hardie 2013)
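As a rough sketch of how some of these models can be fitted, the snippet below follows the workflow I know from the CLVTools walkthrough, using the `apparelTrans` and `apparelStaticCov` example data that ship with the package. It assumes CLVTools is installed. The `estimation.split` value and the choice of covariates are illustrative, and argument names may differ between package versions, so please check the package documentation before relying on it.

```r
library(CLVTools)

# Example data shipped with CLVTools
data("apparelTrans")
data("apparelStaticCov")

# Build the transaction data object; the first 40 weeks are used for estimation
# (illustrative split), the remainder is kept as a holdout period.
clv.apparel <- clvdata(apparelTrans,
                       date.format = "ymd", time.unit = "week",
                       estimation.split = 40,
                       name.id = "Id", name.date = "Date", name.price = "Price")

# Standard Pareto/NBD
est.pnbd <- pnbd(clv.data = clv.apparel)

# Pareto/NBD with time-invariant contextual factors
clv.static <- SetStaticCovariates(clv.data = clv.apparel,
                                  data.cov.life   = apparelStaticCov,
                                  data.cov.trans  = apparelStaticCov,
                                  names.cov.life  = c("Gender", "Channel"),
                                  names.cov.trans = c("Gender", "Channel"),
                                  name.id = "Id")
est.pnbd.static <- pnbd(clv.data = clv.static)

# Gamma/Gamma model for customer spending
est.gg <- gg(clv.data = clv.apparel)

summary(est.pnbd.static)
head(predict(est.pnbd.static))  # predicts up to the end of the holdout period
```

Swapping `pnbd()` for `bgnbd()` or `ggomnbd()` should fit the BG/NBD and Gamma/Gompertz/NBD variants on the same data object.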
Reference
Customer Base Analysis with BTYDplus. This is the tutorial for the R package BTYDplus 📦. Most of the literature references in the table above can be found there.