BT_call {BT} | R Documentation |
(Adaptive) Boosting Trees (ABT/BT) fit.
Description
Fit an (Adaptive) Boosting Trees algorithm. This is for "power" users who have a large number of variables and wish to avoid calling model.frame, which can be slow in this instance. In particular, this function is called by BT.
It is split into two main parts: the first handles the initialization (see BT_callInit), whereas the second performs all the boosting iterations (see BT_callBoosting).
By default, this function does not perform input checks (those are all done in BT), so all parameters must be supplied in the right format. The user is therefore assumed to be aware of all the choices made.
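Since no input checks are performed here, it can help to validate the inputs by hand before calling the function. The following is a minimal base R sketch, not part of the package: the toy data, the column names and the helper check_inputs are hypothetical, and the conventions it checks (one non-negative weight per row, response not among the explanatory variables) are assumptions rather than documented requirements.

```r
# Hypothetical toy data; the column names are illustrative only.
set.seed(1)
n <- 200
training.set <- data.frame(
  y  = rpois(n, lambda = 2),                                # count response
  x1 = runif(n),                                            # numeric feature
  x2 = factor(sample(c("a", "b", "c"), n, replace = TRUE))  # categorical feature
)
respVar <- "y"            # name of the response column
explVar <- c("x1", "x2")  # names of the explanatory columns
w <- rep(1, n)            # one weight per row (assumed convention)

# Hypothetical helper: checks a caller may want to guarantee up front.
check_inputs <- function(data, respVar, explVar, w) {
  stopifnot(
    is.data.frame(data),
    is.character(respVar), length(respVar) == 1L,
    respVar %in% names(data),
    is.character(explVar), all(explVar %in% names(data)),
    !(respVar %in% explVar),
    is.numeric(w), length(w) == nrow(data), all(w >= 0)
  )
  invisible(TRUE)
}

inputs_ok <- check_inputs(training.set, respVar, explVar, w)
```

If any condition fails, stopifnot raises an error naming the offending expression, so a badly formatted argument is caught before the fit starts.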
Usage
BT_call(
training.set,
validation.set,
tweedie.power,
respVar,
w,
explVar,
ABT,
tree.control,
train.fraction,
interaction.depth,
bag.fraction,
shrinkage,
n.iter,
colsample.bytree,
keep.data,
is.verbose
)
BT_callInit(training.set, validation.set, tweedie.power, respVar, w)
BT_callBoosting(
training.set,
validation.set,
tweedie.power,
ABT,
tree.control,
interaction.depth,
bag.fraction,
shrinkage,
n.iter,
colsample.bytree,
train.fraction,
keep.data,
is.verbose,
respVar,
w,
explVar
)
Arguments
training.set |
a data frame containing all the related variables on which one wants to fit the algorithm. |
validation.set |
a held-out data frame containing all the related variables on which one wants to assess the algorithm performance. This can be NULL. |
tweedie.power |
Experimental parameter, currently not used. Set to 1, which corresponds to the Poisson distribution. |
respVar |
the name of the target/response variable. |
w |
a vector of weights. |
explVar |
a vector containing the name of explanatory variables. |
ABT |
a boolean parameter. If |
tree.control |
defines additional tree parameters that will be used at each iteration. See |
train.fraction |
the first |
interaction.depth |
the maximum depth of variable interactions: 1 builds an additive model, 2 builds a model with up to two-way interactions, etc.
This parameter can also be interpreted as the maximum number of non-terminal nodes. By default, it is set to 4.
Please note that if this parameter is |
bag.fraction |
the fraction of independent training observations randomly selected to propose the next tree in the expansion.
This introduces randomness into the model fit. If |
shrinkage |
a shrinkage parameter applied to each tree in the expansion. Also known as the learning rate or step-size reduction. |
n.iter |
the total number of iterations to fit. This is equivalent to the number of trees and the number of basis functions in the additive expansion.
Please note that the initialization is not taken into account in the |
colsample.bytree |
each tree will be trained on a random subset of |
keep.data |
a boolean variable indicating whether to keep the data frames. This is particularly useful if one wants to keep track of the initial data frames
and is further used for predicting in case any data frame is specified.
Note that in case of cross-validation, if |
is.verbose |
if |
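Details
To make the roles of n.iter, shrinkage and bag.fraction concrete, here is a minimal base R sketch of a generic boosting loop. It is not the algorithm implemented by this package: it uses plain least-squares boosting on residuals with a one-split stump on a single variable, whereas BT works with Tweedie losses and full trees. The functions fit_stump, predict_stump and boost are hypothetical names introduced only for this illustration.

```r
set.seed(42)
n <- 300
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.2)

# Best single-split stump on x for a response r (least-squares criterion).
fit_stump <- function(x, r) {
  cuts <- sort(unique(x))
  cuts <- (cuts[-1] + cuts[-length(cuts)]) / 2  # midpoints as candidate splits
  sse <- vapply(cuts, function(cc) {
    left <- x <= cc
    sum((r[left] - mean(r[left]))^2) + sum((r[!left] - mean(r[!left]))^2)
  }, numeric(1))
  best <- cuts[which.min(sse)]
  list(cut = best, left = mean(r[x <= best]), right = mean(r[x > best]))
}

predict_stump <- function(s, x) ifelse(x <= s$cut, s$left, s$right)

boost <- function(x, y, n.iter, shrinkage, bag.fraction) {
  pred <- rep(mean(y), length(y))  # initialization: constant fit
  for (i in seq_len(n.iter)) {     # one stump per iteration
    # Row subsample used to propose the next stump (role of bag.fraction).
    idx <- sample.int(length(y), size = floor(bag.fraction * length(y)))
    s <- fit_stump(x[idx], y[idx] - pred[idx])
    # Damped update on all rows (role of shrinkage).
    pred <- pred + shrinkage * predict_stump(s, x)
  }
  pred
}

mse <- function(p) mean((y - p)^2)
fit_full <- boost(x, y, n.iter = 50, shrinkage = 0.1, bag.fraction = 1)
fit_bag  <- boost(x, y, n.iter = 50, shrinkage = 0.1, bag.fraction = 0.5)
c(baseline = mse(rep(mean(y), n)), full = mse(fit_full), bagged = mse(fit_bag))
```

With bag.fraction = 1 every stump sees all rows, so the training error can only decrease at each step. With bag.fraction = 0.5 each stump is proposed from a random half of the rows, which is the source of randomness mentioned in the argument description.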
Value
a BTFit object.
Author(s)
Gireg Willame gireg.willame@gmail.com
This package is inspired by the gbm3 package. For more details, see https://github.com/gbm-developers/gbm3/.
References
M. Denuit, D. Hainaut and J. Trufin (2019). Effective Statistical Learning Methods for Actuaries I: GLMs and Extensions, Springer Actuarial.
M. Denuit, D. Hainaut and J. Trufin (2019). Effective Statistical Learning Methods for Actuaries II: Tree-Based Methods and Extensions, Springer Actuarial.
M. Denuit, D. Hainaut and J. Trufin (2019). Effective Statistical Learning Methods for Actuaries III: Neural Networks and Extensions, Springer Actuarial.
M. Denuit, D. Hainaut and J. Trufin (2022). Response versus gradient boosting trees, GLMs and neural networks under Tweedie loss and log-link. Accepted for publication in Scandinavian Actuarial Journal.
M. Denuit, J. Huyghe and J. Trufin (2022). Boosting cost-complexity pruned trees on Tweedie responses: The ABT machine for insurance ratemaking. Paper submitted for publication.
M. Denuit, J. Trufin and T. Verdebout (2022). Boosting on the responses with Tweedie loss functions. Paper submitted for publication.
See Also
BTFit, BTCVFit, BT_perf, predict.BTFit, summary.BTFit, print.BTFit, .BT_cv_errors.