fvcm {vcrpart} | R Documentation |
Bagging and Random Forests based on tvcm
Description
Bagging (Breiman, 1996) and Random Forest (Breiman, 2001) ensemble
algorithms for tvcm.
Usage
fvcm(..., control = fvcm_control())

fvcm_control(maxstep = 10, minsize = 10,
             folds = folds_control("subsampling", K = 100),
             mtry = 5, sctest = FALSE, alpha = 1.0,
             mindev = 0.0, verbose = TRUE, ...)

fvcolmm(..., family = cumulative(), control = fvcolmm_control())

fvcolmm_control(maxstep = 10, minsize = 20,
                folds = folds_control("subsampling", K = 100),
                mtry = 5, sctest = TRUE, alpha = 1.0,
                nimpute = 1, verbose = TRUE, ...)

fvcglm(..., family, control = fvcglm_control())

fvcglm_control(maxstep = 10, minsize = 10,
               folds = folds_control("subsampling", K = 100),
               mtry = 5, mindev = 0,
               verbose = TRUE, ...)
Arguments
... |
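for fvcm, fvcolmm and fvcglm, arguments to be passed to tvcm, such as formula, data and family (see Examples). For the control functions, further control arguments to be passed to tvcm_control. |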
control |
a list of control parameters as produced by
fvcm_control, fvcolmm_control or fvcglm_control. |
family |
the model family, e.g., cumulative() or gaussian(). |
maxstep |
integer. The maximum number of steps when growing individual trees. |
folds |
a list of parameters to control the extraction of subsets,
as created by folds_control. |
mtry |
positive integer scalar. The number of combinations of partitions, nodes and variables to be randomly sampled as candidates in each iteration. |
sctest |
logical scalar. Defines whether coefficient constancy tests should be used for the variable and node selection in each iteration. |
mindev , alpha |
these parameters are merely specified to
disable the default stopping rules for tvcm. |
minsize , nimpute |
special parameter settings for
fvcolmm. |
verbose |
logical. Should information about the fitting process be printed to the screen? |
Details
Implements the Bagging (Breiman, 1996) and Random
Forests (Breiman, 2001) ensemble algorithms for tvcm. The method
consists of growing multiple trees with tvcm and aggregating the
fitted coefficient functions on the scale of the predictor function.
To enable bagging, use mtry = Inf in fvcm_control.
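The subsample-and-aggregate idea can be illustrated with a minimal base R sketch. It does not call vcrpart and is not the package's implementation: a one-split step function stands in for a tree grown by tvcm, and the helpers fit_stump and predict_stump, the simulated data and all settings are assumptions made for illustration only. Because the model is Gaussian with identity link, averaging the coefficients directly corresponds to averaging on the scale of the predictor function.

```r
## Illustrative only: a one-split "stump" for the coefficient of x1
## stands in for a tree grown by tvcm.
set.seed(1)
n  <- 400
z1 <- runif(n, -pi / 2, pi / 2)          # moderator
x1 <- rnorm(n)                           # predictor
y  <- sin(z1) * x1 + rnorm(n, sd = 0.5)  # true coefficient of x1 is sin(z1)
dat <- data.frame(y, x1, z1)

## Fit y ~ x1 separately left and right of the best cutpoint in z1.
fit_stump <- function(d, cuts = seq(-1.2, 1.2, by = 0.2)) {
  rss <- sapply(cuts, function(cp) {
    left <- d$z1 <= cp
    sum(resid(lm(y ~ x1, d[left, ]))^2) +
      sum(resid(lm(y ~ x1, d[!left, ]))^2)
  })
  cp <- cuts[which.min(rss)]
  left <- d$z1 <= cp
  list(cut   = cp,
       left  = coef(lm(y ~ x1, d[left, ]))[["x1"]],
       right = coef(lm(y ~ x1, d[!left, ]))[["x1"]])
}

## Coefficient function of a single stump evaluated at moderator values z.
predict_stump <- function(fit, z) ifelse(z <= fit$cut, fit$left, fit$right)

## Grow K stumps on subsamples of half the data and average them.
K <- 25
grid <- seq(-1.5, 1.5, by = 0.1)
coefs <- sapply(seq_len(K), function(k) {
  idx <- sample(n, size = floor(0.5 * n))
  predict_stump(fit_stump(dat[idx, ]), grid)
})
ensemble <- rowMeans(coefs)  # aggregated coefficient function on the grid
```

Each column of coefs holds the step function of one stump; ensemble is their pointwise average. When the selected cutpoints differ between subsamples, the average takes more than two distinct values and is typically smoother than any single step function.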
fvcolmm and fvcglm are the extensions for tvcolmm and tvcglm.
fvcm_control is a wrapper of tvcm_control, and the arguments indicated
specify modified defaults and parameters for randomizing split
selections. Notice that, relative to tvcm_control, the cv and prune
arguments are also internally disabled. The default arguments for
alpha and maxoverstep essentially disable the stopping rules of tvcm,
so that the argument maxstep (the number of iterations, i.e., the
maximum number of splits) fully controls the stopping. The parameter
mtry controls the randomization for selecting combinations of
partitions, nodes and variables for splitting. The default of
mtry = 5 is arbitrary.
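The role of mtry can be pictured with a short base R sketch. This is an illustration under stated assumptions, not vcrpart's internal code: the candidate table and the helper sample_candidates are hypothetical, and only the sampling of candidate combinations is shown, not the split search itself.

```r
## Hypothetical candidate set: every combination of partition, node
## and variable that could be considered in one iteration.
candidates <- expand.grid(partition = 1, node = 1:3,
                          variable = c("z1", "z2", "z3", "z4"),
                          stringsAsFactors = FALSE)

## Draw at most 'mtry' candidates at random. With mtry = Inf every
## candidate is kept, which corresponds to plain bagging.
sample_candidates <- function(candidates, mtry = 5) {
  n <- nrow(candidates)
  keep <- sample.int(n, size = min(mtry, n))
  candidates[sort(keep), , drop = FALSE]
}

set.seed(3)
nrow(sample_candidates(candidates, mtry = 5))    # 5 of the 12 combinations
nrow(sample_candidates(candidates, mtry = Inf))  # all 12 combinations
```

Smaller values of mtry restrict each iteration to fewer candidates and so make the individual trees more dissimilar; mtry = Inf removes this source of randomness, leaving only the subsampling defined by folds.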
Value
An object of class fvcm.
Author(s)
Reto Burgin
References
Breiman, L. (1996). Bagging Predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32.
Hastie, T., R. Tibshirani and J. Friedman (2001). The Elements of Statistical Learning (2 ed.). New York, USA: Springer-Verlag.
Burgin, R. A. (2015). Tree-based methods for moderated regression with application to longitudinal data. PhD thesis. University of Geneva.
See Also
fvcm-methods, tvcm, glm, olmm
Examples
## ------------------------------------------------------------------- #
## Dummy example:
##
## Bagging 'tvcm' on the artificially generated data 'vcrpart_3'. The
## true coefficient function is a sine curve between -pi/2 and pi/2.
## The parameters 'maxstep = 3' and 'K = 5' are chosen to restrict the
## computations.
## ------------------------------------------------------------------- #
## simulated data
data(vcrpart_3)
## setting parameters
control <-
  fvcm_control(maxstep = 3,
               folds = folds_control("subsampling", K = 5, 0.5, seed = 3))
## fitting the forest
model <- fvcm(y ~ vc(z1, by = x1), data = vcrpart_3,
              family = gaussian(), control = control)
## plot the first two trees
plot(model, "coef", 1:2)
## plotting the partial dependency of the coefficient for 'x1'
plot(model, "partdep")