explain {pre} | R Documentation |
Explain predictions from final prediction rule ensemble
Description
explain shows which rules apply to which observations and visualizes
the contribution of rules and linear predictors to the predicted values.
Usage
explain(
  object,
  newdata,
  penalty.par.val = "lambda.1se",
  response = 1L,
  plot = TRUE,
  intercept = FALSE,
  center.linear = FALSE,
  plot.max.nobs = 4,
  plot.dim = c(2, 2),
  plot.obs.names = TRUE,
  pred.type = "response",
  digits = 3L,
  cex = 0.8,
  ylab = "Contribution to linear predictor",
  bar.col = c("#E495A5", "#39BEB1"),
  rule.col = "darkgrey",
  ...
)
Arguments
object |
object of class |
newdata |
optional dataframe of new (test) observations, including all predictor variables used for deriving the prediction rule ensemble. |
penalty.par.val |
character or numeric. Value of the penalty parameter
|
response |
numeric or character vector of length one. Specifies the
name or number of the response variable (for multivariate responses) or
the name or number of the factor level (for multinomial responses) for
which explanations and contributions should be computed and/or plotted.
Only used for |
plot |
logical. Should explanations be plotted? |
intercept |
logical. Specifies whether intercept should be included in explaining predictions. |
center.linear |
logical. Specifies whether linear terms should be
centered with respect to the training sample mean before computing their
contribution to the predicted value. If |
plot.max.nobs |
numeric. Specifies maximum number of observations
for which explanations will be plotted. The default ( |
plot.dim |
numeric vector of length 2. Specifies the number of rows and columns in the resulting plot. |
plot.obs.names |
logical vector of length 1, NULL, or character vector
of length |
pred.type |
character. Specifies the type of predicted values to be computed, returned and provided in the plot(s). Note that the computed contributions must be additive and are therefore always on the scale of the linear predictor. |
digits |
integer. Specifies the number of digits used in depicting the predicted values in the plot. |
cex |
numeric. Specifies the relative text size of title, tick and axis labels. |
ylab |
character. Specifies the label for the y-axis. |
bar.col |
character vector of length two. Specifies the colors to be used for plotting the positive and negative contributions to the predictions, respectively. |
rule.col |
character. Specifies the color to be used for plotting the rule
descriptions. If |
... |
Further arguments to be passed to |
Details
Provides a graphical depiction of the contribution of rules and
linear terms to the individual predictions (if plot = TRUE).
Invisibly returns a list with objects predictors and contribution.
predictors contains the values of the rules and linear terms for each
observation in newdata, for those rules and linear terms included in
the final ensemble with the specified value of penalty.par.val.
contribution contains the values of predictors, multiplied by the
estimated values of the coefficients in the final ensemble selected
with the specified value of penalty.par.val.
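The relation between the two returned objects can be illustrated with a minimal base R sketch. All values below (the rule memberships, the Temp values, the coefficients and the intercept) are hypothetical and are not taken from a fitted ensemble; the sketch only mirrors the multiplication described above and is not the package's implementation.

```r
# Hypothetical values of two rules (0/1) and one linear term (Temp)
# for three observations; this plays the role of `predictors`.
predictors <- cbind(
  rule1 = c(1, 0, 1),
  rule2 = c(0, 0, 1),
  Temp  = c(67, 72, 81)
)

# Hypothetical coefficients and intercept (assumptions for illustration).
coefs <- c(rule1 = 4.5, rule2 = -2.0, Temp = 0.3)
intercept <- 10

# Multiply every column by its coefficient; this plays the role of
# `contribution`.
contribution <- sweep(predictors, 2, coefs[colnames(predictors)], `*`)

# Contributions are additive on the scale of the linear predictor, so in
# this sketch each row sums to the linear predictor minus the intercept.
linear_predictor <- intercept + rowSums(contribution)
```

Under these assumptions, contribution has the same dimensions as predictors, with one column per rule or linear term.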
All contributions are calculated w.r.t. the intercept, by default.
Thus, if a given rule applies to an observation in newdata, the
contribution of that rule equals the estimated coefficient of that
rule. If a given rule does not apply to an observation in newdata,
the contribution of that rule equals 0.
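As a rough illustration of a single rule's contribution, the base R sketch below evaluates one rule on a few rows of airquality. The rule condition and its coefficient are hypothetical (they do not come from a fitted ensemble), and the variable names rule_applies, rule_coef and rule_contribution are introduced here for illustration only.

```r
airq <- airquality[complete.cases(airquality), ]
newdata <- airq[1:5, ]

# Hypothetical rule and coefficient (assumptions, not fitted values).
rule_applies <- with(newdata, Temp > 70 & Wind <= 10)
rule_coef <- 8.3

# The contribution equals the coefficient where the rule applies,
# and 0 where it does not.
rule_contribution <- as.numeric(rule_applies) * rule_coef

data.frame(newdata[, c("Temp", "Wind")], rule_applies, rule_contribution)
```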
For linear terms, contributions can be centered, or not (the default).
Thus, by default the contribution of a linear term for an observation
in newdata equals the observation's value of the linear term, times
the estimated coefficient of the linear term. If center.linear = TRUE,
the contribution of a linear term for an observation in newdata equals
(the value of the linear term, minus the mean value of the linear term
in the training data) times the estimated coefficient for the linear
term.
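The arithmetic of the two options can be sketched in base R as follows. The training values, new values and coefficient are hypothetical, and the sketch only reproduces the two formulas stated above; it is not the package's implementation.

```r
# Hypothetical training values, new observations and coefficient
# for a single linear term (assumptions for illustration).
train_x <- c(5, 6, 7, 8, 9)
new_x   <- c(5, 9)
beta    <- -1.2

# Uncentered: value of the linear term times its coefficient.
uncentered <- new_x * beta

# Centered: (value minus training mean) times the coefficient.
centered <- (new_x - mean(train_x)) * beta

# The two versions differ by the same constant for every observation,
# namely mean(train_x) * beta.
shift <- uncentered - centered
```

In this sketch both uncentered contributions are negative, while the centered contribution is positive for the observation below the training mean and negative for the one above it. This shows how centering can change the sign of a contribution.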
References
Fokkema, M. & Strobl, C. (2020). Fitting prediction rule ensembles to psychological research data: An introduction and tutorial. Psychological Methods 25(5), 636-652. doi:10.1037/met0000256, https://arxiv.org/abs/1907.05302
See Also
pre, plot.pre, coef.pre, importance.pre, cvpre, interact, print.pre
Examples
airq <- airquality[complete.cases(airquality), ]
set.seed(1)
train <- sample(1:nrow(airq), size = 100)
set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airq[train,])
airq.ens.exp <- explain(airq.ens, newdata = airq[-train,])
airq.ens.exp$predictors
airq.ens.exp$contribution
## Can also include intercept in explanation:
airq.ens.exp <- explain(airq.ens, newdata = airq[-train,], intercept = TRUE)
## Fit PRE with linear terms only to illustrate effect of center.linear:
set.seed(42)
airq.ens2 <- pre(Ozone ~ ., data = airq[train,], type = "linear")
## When not centered around their means, Month has negative and
## Day has positive contribution:
explain(airq.ens2, newdata = airq[-train,][1:2,],
        penalty.par.val = "lambda.min")$contribution
## After mean centering, contributions of Month and Day have switched
## sign (for these two observations):
explain(airq.ens2, newdata = airq[-train,][1:2,],
        penalty.par.val = "lambda.min", center.linear = TRUE)$contribution