varimp {mboost} | R Documentation |
Variable Importance
Description
In-bag risk reduction per base-learner as variable importance for boosting.
Usage
## S3 method for class 'mboost'
varimp(object, ...)
## S3 method for class 'varimp'
plot(x, percent = TRUE, type = c("variable", "blearner"),
     blorder = c("importance", "alphabetical", "rev_alphabetical", "formula"),
     nbars = 10L, maxchar = 20L, xlab = NULL, ylab = NULL, xlim, auto.key, ...)
## S3 method for class 'varimp'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
Arguments
object |
an object of class |
x |
an object of class |
percent |
logical, indicating whether variable importance should be specified in percent. |
type |
a character string specifying whether to draw bars for variables
( |
blorder |
a character string specifying the order of the base-learners
in the plot. The default |
nbars |
integer, maximum number of bars to be plotted. If |
maxchar |
integer, maximum number of characters in bar labels. |
xlab |
text for the x-axis label. If not set (default is |
ylab |
text for the y-axis label. If not set (default is |
xlim |
the x limits of the plot. Defaults are from |
auto.key |
logical, or a list passed to |
... |
additional arguments passed to |
row.names |
NULL or a character vector giving the row names for the data frame. Missing values are not allowed. |
optional |
logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. |
Details
This function extracts the in-bag risk reductions per boosting step of a
fitted mboost model and accumulates them individually for each base-learner
contained in the model. This quantifies the individual contribution of each
base-learner to the risk reduction and can thus be used to compare the
importance of different base-learners or variables in the model. Starting from
the offset-only model, the risk reduction in each boosting step is computed as
the difference between the in-bag risk of the previous model and that of the
current model, and is attributed to the base-learner selected in that step.
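The accumulation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper accumulate_reduction() and the toy numbers are hypothetical, the risk vector is assumed to start with the offset-only risk, and the selection vector is assumed to hold one base-learner index per boosting step.

```r
# Attribute each step's in-bag risk reduction to the base-learner selected in it.
# `risk`:       in-bag risks, offset-only model first (length = steps + 1).
# `selected`:   index of the base-learner chosen in each boosting step.
# `n_learners`: number of base-learners in the model.
# (Hypothetical helper for illustration; not part of mboost.)
accumulate_reduction <- function(risk, selected, n_learners) {
  stopifnot(length(risk) == length(selected) + 1L)
  reduction <- -diff(risk)  # previous risk minus current risk, per step
  vapply(seq_len(n_learners),
         function(j) sum(reduction[selected == j]),
         numeric(1))
}

# Made-up numbers: four boosting steps, three base-learners.
toy_risk     <- c(10, 8, 7, 6.5, 6)
toy_selected <- c(1, 2, 1, 3)

importance <- accumulate_reduction(toy_risk, toy_selected, n_learners = 3)
importance  # 2.5 1.0 0.5
```

The per-learner values add up to the total risk reduction from the offset-only model to the final model (here 10 - 6 = 4). For a fitted model, the inputs would presumably come from the model's in-bag risk trace and selection path; that these line up exactly as assumed here should be checked against the package before relying on it.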
The results can be plotted in a bar plot either for the base-learners or for
the variables contained in the model. The bars are ordered according to
variable importance. If their number exceeds nbars, the least important are
summarized as "other". If bars are plotted per variable, all base-learners
containing the same variable are accumulated in a stacked bar. This is useful
for models that include, for example, separate base-learners for the linear
and non-linear parts of a covariate effect (see ?bbs, option center=TRUE).
However, variable interactions are treated as individual variables, as their
desired handling might depend on context.
For comparison, the selection frequencies are added to the respective base-learner labels in the plot (rounded to three digits). For stacked bars they are ordered accordingly.
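The per-variable stacking and the "other" category can be illustrated with a small base R sketch. Everything here is an assumption made for illustration: the data frame, its numbers, and the helper collapse_other() are hypothetical, and whether the plot method counts "other" towards nbars is not specified above, so the sketch simply keeps the nbars largest entries and lumps the rest.

```r
# Hypothetical per-base-learner risk reductions (illustrative numbers only).
imp <- data.frame(
  blearner  = c("bols(x1)", "bbs(x1)", "bols(x2)", "bbs(x2)", "bols(x3)"),
  variable  = c("x1", "x1", "x2", "x2", "x3"),
  reduction = c(4, 2, 1, 0.5, 0.25),
  stringsAsFactors = FALSE
)

# Sum the reductions of all base-learners that share a variable,
# then order the variables by decreasing importance.
per_variable <- vapply(split(imp$reduction, imp$variable), sum, numeric(1))
per_variable <- sort(per_variable, decreasing = TRUE)

# Keep the `nbars` most important entries and lump the rest into "other".
# (Assumed behaviour for illustration; not taken from the mboost source.)
collapse_other <- function(x, nbars) {
  if (length(x) <= nbars) return(x)
  c(head(x, nbars), other = sum(tail(x, length(x) - nbars)))
}

collapsed <- collapse_other(per_variable, nbars = 2)

# Express importance in percent of the total risk reduction.
percent <- 100 * collapsed / sum(collapsed)
```

Here x1 combines its linear and smooth base-learners into one entry, which mirrors the stacked bar drawn when bars are plotted per variable.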
Value
An object of class varimp with available plot and as.data.frame methods.
Converting a varimp object results in a data.frame containing the risk
reductions, selection frequencies and the corresponding base-learner and
variable names as ordered factors (ordered according to their particular
importance).
Author(s)
Tobias Kuehn (tobi.kuehn@gmx.de), Almond Stoecker (almond.stoecker@gmail.com)
Examples
library(mboost)
data(iris)
### glmboost with multiple variables and intercept
iris$setosa <- factor(iris$Species == "setosa")
iris_glm <- glmboost(setosa ~ 1 + Sepal.Width + Sepal.Length + Petal.Width +
                         Petal.Length,
                     data = iris, control = boost_control(mstop = 50),
                     family = Binomial(link = c("logit")))
varimp(iris_glm)
### importance plot with four bars only
plot(varimp(iris_glm), nbars = 4)
### gamboost with multiple variables
iris_gam <- gamboost(Sepal.Width ~
                         bols(Sepal.Length, by = setosa) +
                         bbs(Sepal.Length, by = setosa, center = TRUE) +
                         bols(Petal.Width) +
                         bbs(Petal.Width, center = TRUE) +
                         bols(Petal.Length) +
                         bbs(Petal.Length, center = TRUE),
                     data = iris)
varimp(iris_gam)
### stacked importance plot with base-learners in rev. alphabetical order
plot(varimp(iris_gam), blorder = "rev_alphabetical")
### similar ggplot
## Not run:
library(ggplot2)
ggplot(data.frame(varimp(iris_gam)), aes(variable, reduction, fill = blearner)) +
    geom_bar(stat = "identity") + coord_flip()
## End(Not run)