plot.clv.data {CLVTools}    R Documentation

Plot Diagnostics for the Transaction Data in a clv.data Object


Description

Depending on the value of the parameter which, one of the following plots is produced. Note that the sample parameter determines the period for which the selected plot is made (either estimation, holdout, or full).

Tracking Plot

Plot the aggregated repeat transactions per period over the given time-horizon (prediction.end). See Details for the definition of plotting periods.

Frequency Plot

Plot the distribution of transactions or repeat transactions per customer, after aggregating transactions of the same customer on a single time point. Note that if trans.bins is changed, label.remaining usually needs to be adapted as well.

Spending Plot

Plot the empirical density of either each customer's average spending per transaction or the value of every transaction in the data, after aggregating transactions of the same customer on a single time point. Note that in all cases this includes all transactions and not only repeat transactions.

Interpurchase Time Plot

Plot the empirical density of each customer's mean time (in number of periods) between transactions, after aggregating transactions of the same customer on a single time point. Note that customers without repeat transactions are removed.
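As a rough illustration of the quantity summarised by the interpurchase time plot, the base-R sketch below computes the mean time between transactions per customer. The toy data, the column names Id and Date, and the use of days as the period are assumptions made for this example; the code is not taken from the package source.

```r
# Hypothetical toy transactions (made up, not CLVTools data)
trans <- data.frame(
  Id   = c("A", "A", "A", "A", "B", "C", "C"),
  Date = as.Date(c("2019-01-01", "2019-01-01", "2019-01-08", "2019-01-22",
                   "2019-01-03",
                   "2019-01-05", "2019-01-15"))
)

# Aggregate transactions of the same customer on the same time point
trans <- unique(trans[, c("Id", "Date")])

# Mean number of days between consecutive transactions per customer;
# customers with a single transaction have no interpurchase time (NA)
mean.ipt <- sapply(split(trans$Date, trans$Id), function(d) {
  if (length(d) < 2) return(NA_real_)
  mean(as.numeric(diff(sort(d))))
})

# Remove customers without repeat transactions
mean.ipt <- mean.ipt[!is.na(mean.ipt)]
mean.ipt
```

Customer B has only one transaction and therefore drops out, mirroring the note above that customers without repeat transactions are removed.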


Usage

## S3 method for class 'clv.data'
plot(
  x,
  which = c("tracking", "frequency", "spending", "interpurchasetime"),
  prediction.end = NULL,
  cumulative = FALSE,
  trans.bins = 0:9,
  count.repeat.trans = TRUE,
  count.remaining = TRUE,
  label.remaining = "10+",
  mean.spending = TRUE,
  sample = c("estimation", "full", "holdout"),
  geom = "line",
  color = "black",
  plot = TRUE,
  verbose = TRUE,
  ...
)



Arguments

x

The clv.data object to plot.

which

Which plot to produce, either "tracking", "frequency", "spending" or "interpurchasetime". May be abbreviated but only one may be selected. Defaults to "tracking".

prediction.end

"tracking": Until what point in time to plot. This can be the number of periods (numeric) or a form of date/time object. See Details.

cumulative

"tracking": Whether the cumulative actual repeat transactions should be plotted.

trans.bins

"frequency": Vector of integers indicating the number of transactions (x axis) for which the customers should be counted.

count.repeat.trans

"frequency": Whether repeat transactions (TRUE, default) or all transactions (FALSE) should be counted.

count.remaining

"frequency": Whether the customers which are not captured by trans.bins should be counted in a separate last bar.

label.remaining

"frequency": Label for the last bar, if count.remaining=TRUE.

mean.spending

"spending": Whether each customer's mean spending per transaction (TRUE, default) or the value of every transaction in the data (FALSE) should be plotted.

sample

Name of the sample for which the plot should be made, either "estimation", "full", or "holdout". Defaults to "estimation". Not for "tracking".

geom

The geometric object of ggplot2 to display the data. Forwarded to ggplot2::stat_density. Not for "tracking" and "frequency".

color

Color of the resulting geom object in the plot. Not for "tracking".

plot

Whether a plot should be created or only the assembled data returned.

verbose

Show details about the running of the function.

...

Forwarded to ggplot2::stat_density ("spending", "interpurchasetime") or ggplot2::geom_bar ("frequency"). Not for "tracking".
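The interplay of trans.bins, count.remaining and label.remaining in the frequency plot can be sketched in base R. The transaction counts below are made up, and the code only mimics the counting logic described for these arguments; it is an illustration under these assumptions, not the package's implementation.

```r
# Hypothetical number of repeat transactions per customer (made-up values)
num.trans <- c(0, 0, 1, 1, 1, 2, 3, 5, 12, 20)

trans.bins      <- 0:3    # values on the x axis that are counted individually
label.remaining <- "4+"   # label for customers not captured by trans.bins

# Assign each customer to a bin label
bin <- ifelse(num.trans %in% trans.bins,
              as.character(num.trans), label.remaining)
bin <- factor(bin, levels = c(as.character(trans.bins), label.remaining))

# Number of customers per bin, in plotting order
num.customers <- table(bin)
num.customers
```

With bins 0:3, the default label "10+" would be misleading for the last bar, which is why label.remaining usually has to change together with trans.bins.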


Details

prediction.end indicates until when to predict or plot and can be given as either a point in time (of class Date, POSIXct, or character) or the number of periods. If prediction.end is of class character, the date/time format set when creating the data object is used for parsing. If prediction.end is the number of periods, the end of the fitting period serves as the reference point from which periods are counted. Only full periods may be specified. If prediction.end is omitted or NULL, it defaults to the end of the holdout period if present and to the end of the estimation period otherwise.

The first prediction period is defined to start right after the end of the estimation period. If, for example, weekly time units are used and the estimation period ends on Sunday 2019-01-06, then the first day of the first prediction period is Monday 2019-01-07. Each prediction period includes a total of 7 days, and the first prediction period therefore ends on, and includes, Sunday 2019-01-13. Subsequent prediction periods again start on Mondays and end on Sundays. If prediction.end indicates a timepoint on which to end, this timepoint is included in the prediction period.
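The period boundaries follow from plain date arithmetic, as the base-R sketch below shows. It assumes a weekly time unit and an estimation period ending on 2019-01-06, which is a Sunday; the variable names are chosen for this example only.

```r
# Assumed end of the estimation period; 2019-01-06 is a Sunday
estimation.end <- as.Date("2019-01-06")
period.days    <- 7   # weekly time unit

# Start and end (inclusive) of the first three prediction periods
k <- 1:3
period.start <- estimation.end + (k - 1) * period.days + 1
period.end   <- estimation.end + k * period.days

data.frame(period = k, start = period.start, end = period.end)

# as.POSIXlt()$wday is 0 for Sunday and 1 for Monday
as.POSIXlt(period.start)$wday
as.POSIXlt(period.end)$wday
```

Every period starts on a Monday and ends on a Sunday, and the end date itself belongs to the period.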

If there are no repeat transactions until prediction.end, only the time for which there is data is plotted. If the data is returned (i.e. with argument plot=FALSE), the respective rows contain NA in column Number of Repeat Transactions.


Value

An object of class ggplot from package ggplot2 is returned by default. If plot=FALSE, the data that would have been used to create the plot is returned. Depending on which plot was selected, this is a data.table which contains some of the following columns:


Id

Customer Id.

period.until

The timepoint that marks the end (up until and including) of the period to which the data in this row refers.

Number of Repeat Transactions

The number of actual repeat transactions in the period that ends at period.until.

Spending

Spending as defined by parameter mean.spending.

mean.interpurchase.time

Mean number of periods between transactions per customer, excluding customers with no repeat transactions.

num.transactions

The number of (repeat) transactions, depending on count.repeat.trans.

num.customers

The number of customers.
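When working with the returned tracking data, the NA rows mentioned in Details usually have to be handled before further calculations. The sketch below uses a made-up stand-in with the columns period.until and Number of Repeat Transactions; the values are invented, and a plain data.frame is used here instead of a data.table to keep the example free of dependencies.

```r
# Hypothetical stand-in for the tracking data returned with plot=FALSE
# (values made up; the real object is a data.table)
dt.plot <- data.frame(
  period.until = as.Date("2019-01-06") + 7 * (1:5),
  `Number of Repeat Transactions` = c(12, 9, 15, NA, NA),
  check.names = FALSE
)

# Periods without repeat transactions up to prediction.end contain NA
has.data <- !is.na(dt.plot[["Number of Repeat Transactions"]])
dt.obs   <- dt.plot[has.data, ]

# Cumulative repeat transactions over the observed periods
dt.obs$cumulative <- cumsum(dt.obs[["Number of Repeat Transactions"]])
dt.obs
```

Because the column name contains spaces, it has to be accessed with [[ ]] or with backticks.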

See Also

ggplot2::stat_density and ggplot2::geom_bar for possible arguments to ...

plot to plot fitted transaction models

plot to plot fitted spending models


Examples

library(CLVTools)
data("cdnow")
# date.format is required by clvdata(); "ymd_HMS" is assumed here
clv.cdnow <- clvdata(cdnow, time.unit="w", estimation.split=37,
                     date.format="ymd_HMS")

# Plot the actual repeat transactions
plot(clv.cdnow)
# same, explicitly
plot(clv.cdnow, which="tracking")

# plot cumulative repeat transactions
plot(clv.cdnow, cumulative=TRUE)

# Don't plot automatically but tweak further
library(ggplot2) # for ggtitle()
gg.cdnow <- plot(clv.cdnow)
# change title
gg.cdnow + ggtitle("CDnow repeat transactions")

# Don't return a plot but only the data from
#   which it would have been created
dt.plot.data <- plot(clv.cdnow, plot=FALSE)

# Plot the distribution of repeat transactions per customer
plot(clv.cdnow, which="frequency")

# Bins from 0 to 15, all remaining in bin labelled "16+"
plot(clv.cdnow, which="frequency", trans.bins=0:15,
     label.remaining="16+")

# Count all transactions, not only repeat
#  Note that the bins have to be adapted to start from 1
plot(clv.cdnow, which="frequency", count.repeat.trans = FALSE,
     trans.bins=1:9)

# plot customer's average transaction value
plot(clv.cdnow, which="spending", mean.spending = TRUE)

# distribution of the values of every transaction
plot(clv.cdnow, which="spending", mean.spending = FALSE)

# plot as small points, in blue
plot(clv.cdnow, which="interpurchasetime",
     geom="point", color="blue", size=0.02)

[Package CLVTools version 0.9.0 Index]