pnbd {CLVTools} | R Documentation |
Pareto/NBD models
Description
Fits Pareto/NBD models on transactional data with and without covariates.
Usage
## S4 method for signature 'clv.data'
pnbd(
  clv.data,
  start.params.model = c(),
  use.cor = FALSE,
  start.param.cor = c(),
  optimx.args = list(),
  verbose = TRUE,
  ...
)
## S4 method for signature 'clv.data.static.covariates'
pnbd(
  clv.data,
  start.params.model = c(),
  use.cor = FALSE,
  start.param.cor = c(),
  optimx.args = list(),
  verbose = TRUE,
  names.cov.life = c(),
  names.cov.trans = c(),
  start.params.life = c(),
  start.params.trans = c(),
  names.cov.constr = c(),
  start.params.constr = c(),
  reg.lambdas = c(),
  ...
)
## S4 method for signature 'clv.data.dynamic.covariates'
pnbd(
  clv.data,
  start.params.model = c(),
  use.cor = FALSE,
  start.param.cor = c(),
  optimx.args = list(),
  verbose = TRUE,
  names.cov.life = c(),
  names.cov.trans = c(),
  start.params.life = c(),
  start.params.trans = c(),
  names.cov.constr = c(),
  start.params.constr = c(),
  reg.lambdas = c(),
  ...
)
Arguments
clv.data |
The data object on which the model is fitted. |
start.params.model |
Named start parameters containing the optimization start parameters for the model without covariates. |
use.cor |
Whether the correlation between the transaction and lifetime process should be estimated. |
start.param.cor |
Start parameter for the optimization of the correlation. |
optimx.args |
Additional arguments to control the optimization which are forwarded to the optimizer (optimx). |
verbose |
Show details about the running of the function. |
... |
Ignored |
names.cov.life |
Which of the lifetime covariates set in the data object should be used. If missing, all lifetime covariates are used. |
names.cov.trans |
Which of the transaction covariates set in the data object should be used. If missing, all transaction covariates are used. |
start.params.life |
Named start parameters containing the optimization start parameters for all lifetime covariates. |
start.params.trans |
Named start parameters containing the optimization start parameters for all transaction covariates. |
names.cov.constr |
Which covariates should be forced to use the same parameters for the lifetime and transaction process. The covariates need to be present as both lifetime and transaction covariates. |
start.params.constr |
Named start parameters containing the optimization start parameters for the constraint covariates. |
reg.lambdas |
Named lambda parameters used for the L2 regularization of the lifetime and the transaction covariate parameters. Lambdas have to be >= 0. |
Details
Model parameters for the Pareto/NBD model are alpha, r, beta, and s.
s: shape parameter of the Gamma distribution for the lifetime process. The smaller s, the stronger the heterogeneity of customer lifetimes.
beta: rate parameter of the Gamma distribution for the lifetime process.
r: shape parameter of the Gamma distribution of the purchase process. The smaller r, the stronger the heterogeneity of the purchase process.
alpha: rate parameter of the Gamma distribution of the purchase process.
Based on these parameters, the average purchase rate while customers are active is r/alpha and the average dropout rate is s/beta.
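As a small illustration of these two ratios, the following base R sketch computes them for a set of hypothetical parameter values (the same values used as start parameters in the Examples, not estimates from any fitted model). It relies only on the fact that a Gamma distribution with a given shape and rate has mean shape/rate.

```r
# Hypothetical parameter values, chosen only for illustration
params <- c(r = 0.5, alpha = 15, s = 0.5, beta = 10)

# The mean of a Gamma(shape, rate) distribution is shape / rate
avg.purchase.rate <- unname(params["r"] / params["alpha"])
avg.dropout.rate  <- unname(params["s"] / params["beta"])

avg.purchase.rate  # r / alpha: purchases per time unit while active
avg.dropout.rate   # s / beta: dropout rate per time unit
```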
Ideally, the start parameters for r and s represent your best guess concerning the heterogeneity of customers in their purchase and dropout rates. If covariates are included in the model, additional parameters for the covariates affecting the attrition and the purchase process become part of the model.
If no start parameters are given, 1.0 is used for all model parameters and 0.1 for covariate parameters. The model start parameters are required to be > 0.
The Pareto/NBD model
The Pareto/NBD is the first model addressing the issue of modeling customer purchases and
attrition simultaneously for non-contractual settings. The model uses a Pareto distribution,
a combination of an Exponential and a Gamma distribution, to explicitly model customers'
(unobserved) attrition behavior in addition to customers' purchase process.
In general, the Pareto/NBD model consists of two parts. A first process models the purchase
behavior of customers as long as the customers are active. A second process models customers'
attrition. Customers live (and buy) for a certain unknown time until they become inactive
and "die". Customer attrition is unobserved. Inactive customers may not be reactivated.
For technical details we refer to the original paper by Schmittlein, Morrison and Colombo
(1987) and the detailed technical note of Fader and Hardie (2005).
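The two processes described above can be sketched as a small base R simulation of a single customer. This is only an illustration of the model assumptions, not code from CLVTools: the helper name simulate.customer, the observation length T.obs, and all parameter values are hypothetical.

```r
set.seed(1)

# Simulate one customer under the Pareto/NBD assumptions.
# Helper name, arguments, and values are hypothetical.
simulate.customer <- function(r, alpha, s, beta, T.obs) {
  lambda   <- rgamma(1, shape = r, rate = alpha)  # individual purchase rate
  mu       <- rgamma(1, shape = s, rate = beta)   # individual dropout rate
  lifetime <- rexp(1, rate = mu)                  # unobserved time until "death"

  # Purchases follow a Poisson process, but only while the customer is active
  active.until   <- min(lifetime, T.obs)
  n.purchases    <- rpois(1, lambda * active.until)
  purchase.times <- sort(runif(n.purchases, min = 0, max = active.until))

  list(lambda = lambda, mu = mu, lifetime = lifetime,
       purchase.times = purchase.times)
}

customer <- simulate.customer(r = 0.5, alpha = 15, s = 0.5, beta = 10,
                              T.obs = 40)
length(customer$purchase.times)  # number of purchases in the observation period
```

The analyst would only see purchase.times; lambda, mu, and lifetime remain latent, which is why attrition has to be inferred by the model.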
Pareto/NBD model with static covariates
The standard Pareto/NBD model captures heterogeneity solely using Gamma distributions. However, exogenous knowledge, such as customer demographics, is often available. This supplementary knowledge may explain part of the heterogeneity among the customers and therefore increase the predictive accuracy of the model. In addition, we can rely on the covariate parameter estimates for inference, i.e. to identify and quantify effects of contextual factors on the two underlying purchase and attrition processes. For technical details we refer to the technical note by Fader and Hardie (2007).
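To make the idea concrete, the sketch below assumes the proportional specification from the Fader and Hardie (2007) note, in which covariates scale the rate parameter of the purchase process as alpha_i = alpha0 * exp(-gamma' z_i). Whether pnbd uses exactly this parameterization internally is an assumption here, and all values, including the coefficient vector gamma.trans, are hypothetical.

```r
# Hypothetical values for illustration only
r      <- 0.5
alpha0 <- 15
gamma.trans <- c(Gender = 0.75, Channel = 0.7)  # covariate coefficients

# Three toy customers with binary covariates
customers <- data.frame(Gender = c(0, 1, 1), Channel = c(0, 0, 1))

# Assumed specification: alpha_i = alpha0 * exp(-gamma' z_i)
linear.predictor <- as.matrix(customers) %*% gamma.trans
alpha.i <- alpha0 * exp(-linear.predictor[, 1])

# Mean purchase rate per customer: r / alpha_i = (r / alpha0) * exp(gamma' z_i)
mean.purchase.rate <- r / alpha.i
mean.purchase.rate
```

Under this assumption, a positive coefficient raises the expected purchase rate of customers for whom the covariate is 1, while a customer with all covariates at 0 keeps the baseline rate r/alpha0.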
Pareto/NBD model with dynamic covariates
In many real-world applications, customer purchase and attrition behavior may be influenced by covariates that vary over time. Consequently, the timing of a purchase and the corresponding value of a covariate at that time become relevant. Time-varying covariates can affect customers on an aggregated level as well as on an individual level: in the first case, all customers are affected simultaneously; in the latter case, a covariate is only relevant for a particular customer. For technical details we refer to the paper by Bachmann, Meierer and Näf (2021).
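The difference between aggregated and individual-level covariates can be seen in a toy long-format table with one row per customer and period. This is a sketch only: the column names Id and Season and all values are hypothetical, while Cov.Date and Marketing mirror the names used in the Examples.

```r
# Toy long-format covariate table; all values are hypothetical
dates <- seq(as.Date("2005-01-02"), by = "week", length.out = 3)
dyn.cov <- data.frame(
  Id       = rep(c("1", "2"), each = length(dates)),
  Cov.Date = rep(dates, times = 2)
)

# Aggregated-level covariate: same value for every customer in a given period
dyn.cov$Season <- as.numeric(dyn.cov$Cov.Date == dates[2])

# Individual-level covariate: varies by customer and by period
dyn.cov$Marketing <- c(0, 1, 0, 0, 1, 1)

dyn.cov
```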
Value
Depending on the data object on which the model was fit, pnbd returns either an object of class clv.pnbd, clv.pnbd.static.cov, or clv.pnbd.dynamic.cov.
The function summary can be used to obtain and print a summary of the results.
The generic accessor functions coefficients, vcov, fitted, logLik, AIC, BIC, and nobs are available.
Note
The Pareto/NBD model with dynamic covariates can currently not be fit with data that has a temporal resolution of less than one day (data that was built with time unit hours).
References
Schmittlein DC, Morrison DG, Colombo R (1987). “Counting Your Customers: Who Are They and What Will They Do Next?” Management Science, 33(1), 1-24.
Bachmann P, Meierer M, Naef J (2021). “The Role of Time-Varying Contextual Factors in Latent Attrition Models for Customer Base Analysis.” Marketing Science, 40(4), 783-809.
Fader PS, Hardie BGS (2005). “A Note on Deriving the Pareto/NBD Model and Related Expressions.” URL http://www.brucehardie.com/notes/009/pareto_nbd_derivations_2005-11-05.pdf.
Fader PS, Hardie BGS (2007). “Incorporating time-invariant covariates into the Pareto/NBD and BG/NBD models.” URL http://www.brucehardie.com/notes/019/time_invariant_covariates.pdf.
Fader PS, Hardie BGS (2020). “Deriving an Expression for P(X(t)=x) Under the Pareto/NBD Model.” URL https://www.brucehardie.com/notes/012/pareto_NBD_pmf_derivation_rev.pdf.
See Also
clvdata to create a clv data object, SetStaticCovariates to add static covariates to an existing clv data object.
gg to fit customers' average spending per transaction with the Gamma-Gamma model.
predict to predict expected transactions, probability of being alive, and customer lifetime value for every customer.
plot to plot the unconditional expectation as predicted by the fitted model.
pmf for the probability to make exactly x transactions in the estimation period, given by the probability mass function (PMF).
The generic functions vcov, summary, fitted.
SetDynamicCovariates to add dynamic covariates on which the pnbd model can be fit.
Examples
data("apparelTrans")
clv.data.apparel <- clvdata(apparelTrans, date.format = "ymd",
                            time.unit = "w", estimation.split = 40)
# Fit standard pnbd model
pnbd(clv.data.apparel)
# Give initial guesses for the model parameters
pnbd(clv.data.apparel,
     start.params.model = c(r=0.5, alpha=15, s=0.5, beta=10))
# Pass additional parameters to the optimizer (optimx):
# use Nelder-Mead as optimization method and print
# detailed information about the optimization process
apparel.pnbd <- pnbd(clv.data.apparel,
                     optimx.args = list(method="Nelder-Mead",
                                        control=list(trace=6)))
# estimated coefs
coef(apparel.pnbd)
# summary of the fitted model
summary(apparel.pnbd)
# predict CLV etc for holdout period
predict(apparel.pnbd)
# predict CLV etc for the next 15 periods
predict(apparel.pnbd, prediction.end = 15)
# Estimate correlation as well
pnbd(clv.data.apparel, use.cor = TRUE)
# To estimate the pnbd model with static covariates,
# add static covariates to the data
data("apparelStaticCov")
clv.data.static.cov <-
  SetStaticCovariates(clv.data.apparel,
                      data.cov.life = apparelStaticCov,
                      names.cov.life = c("Gender", "Channel"),
                      data.cov.trans = apparelStaticCov,
                      names.cov.trans = c("Gender", "Channel"))
# Fit pnbd with static covariates
pnbd(clv.data.static.cov)
# Give initial guesses for both covariate parameters
pnbd(clv.data.static.cov, start.params.trans = c(Gender=0.75, Channel=0.7),
     start.params.life = c(Gender=0.5, Channel=0.5))
# Use regularization
pnbd(clv.data.static.cov, reg.lambdas = c(trans = 5, life=5))
# Force the same coefficient to be used for the Gender covariate
# in both the lifetime and the transaction process
pnbd(clv.data.static.cov, names.cov.constr = "Gender",
     start.params.constr = c(Gender=0.5))
# Fit model only with the Channel covariate for life but
# keep all trans covariates as is
pnbd(clv.data.static.cov, names.cov.life = c("Channel"))
# Add dynamic covariates to the data object
## Not run:
data("apparelDynCov")
clv.data.dyn.cov <-
  SetDynamicCovariates(clv.data = clv.data.apparel,
                       data.cov.life = apparelDynCov,
                       data.cov.trans = apparelDynCov,
                       names.cov.life = c("Marketing", "Gender", "Channel"),
                       names.cov.trans = c("Marketing", "Gender", "Channel"),
                       name.date = "Cov.Date")
# Fit PNBD with dynamic covariates
pnbd(clv.data.dyn.cov)
# The same fitting options as for the
# static covariate are available
pnbd(clv.data.dyn.cov, reg.lambdas = c(trans=10, life=2))
## End(Not run)