makeStackedLearner {mlr} | R Documentation |
Create a stacked learner object.
Description
A stacked learner uses predictions of several base learners and
fits a super learner using these predictions as features in order to
predict the outcome. The following stacking methods are available:
- average: Averages the base learner predictions without weights.
- stack.nocv: Fits the super learner on in-sample predictions of the base learners.
- stack.cv: Fits the super learner on cross-validated predictions of the base learners (the resampling strategy can be set via the resampling argument).
- hill.climb: Selects a subset of base learner predictions using a hill climbing algorithm.
- compress: Trains a neural network to compress the model from a collection of base learners.
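The difference between in-sample and cross-validated base learner predictions can be sketched in base R. This is a minimal illustration and not the mlr implementation: the built-in mtcars data, the two lm() formulas standing in for base learners, and all variable names are assumptions made for the example.

```r
# Illustrative sketch in base R; not how mlr implements stacking internally.
set.seed(1)
d <- mtcars
n <- nrow(d)

# Two hypothetical base learners, expressed as model formulas for lm().
base_formulas <- list(mpg ~ wt, mpg ~ hp + qsec)

# stack.nocv idea: each base learner predicts the rows it was trained on.
in_sample <- sapply(base_formulas, function(f) {
  predict(lm(f, data = d), newdata = d)
})

# stack.cv idea: each row is predicted by a model that did not see that row.
folds <- sample(rep(1:5, length.out = n))
out_of_fold <- sapply(base_formulas, function(f) {
  p <- numeric(n)
  for (k in 1:5) {
    test <- folds == k
    fit <- lm(f, data = d[!test, ])
    p[test] <- predict(fit, newdata = d[test, ])
  }
  p
})

# average idea: unweighted mean of the base learner predictions.
averaged <- rowMeans(in_sample)

# The super learner is fitted with the base learner predictions as features.
level1 <- data.frame(mpg = d$mpg, p1 = out_of_fold[, 1], p2 = out_of_fold[, 2])
super_fit <- lm(mpg ~ p1 + p2, data = level1)
```

Out-of-fold predictions are usually less optimistic than in-sample ones, so a super learner fitted on them is less likely to overweight a base learner that overfits.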
Usage
makeStackedLearner(
base.learners,
super.learner = NULL,
predict.type = NULL,
method = "stack.nocv",
use.feat = FALSE,
resampling = NULL,
parset = list()
)
Arguments
base.learners |
((list of) Learner)
A list of learners created with makeLearner.
|
super.learner |
(Learner | character(1))
The super learner that makes the final prediction based on the base
learners. If you pass a string, the super learner will be created via
makeLearner. Not used for method = 'average'. Default is NULL.
|
predict.type |
(character(1))
Sets the type of the final prediction for method = 'average'. For other
methods, the predict type should be set within super.learner. The result
depends on the predict type of the base learners, which is set within
base.learners:
- "prob": predict.type = 'prob' uses the average of all base learner predictions, and predict.type = 'response' uses the class with the highest probability as the final prediction.
- "response": for classification tasks, predict.type = 'prob' uses the relative frequency of the predicted base learner classes, and predict.type = 'response' uses a majority vote of the base learner predictions. For regression tasks, the final prediction is the average of the base learner predictions.
|
method |
(character(1))
"average" for averaging the predictions of the base learners,
"stack.nocv" for building a super learner using the in-sample predictions
of the base learners,
"stack.cv" for building a super learner using cross-validated
predictions of the base learners,
"hill.climb" for averaging the predictions of the base learners,
with the weights learned by a hill climbing algorithm, and
"compress" for compressing the model to mimic the predictions of a
collection of base learners while speeding up the predictions and reducing
the size of the model. Default is "stack.nocv".
|
use.feat |
(logical(1))
Whether the original features should also be passed to the super learner.
Not used for method = 'average'.
Default is FALSE.
|
resampling |
(ResampleDesc)
Resampling strategy for method = 'stack.cv'.
Currently only CV is allowed for resampling.
The default NULL uses 5-fold CV.
|
parset |
The parameters for the hill.climb method, including:
- replace: Whether a base learner can be selected more than once.
- init: Number of best models included before the selection algorithm starts.
- bagprob: The proportion of models considered in one round of selection.
- bagtime: The number of rounds of the bagging selection.
- metric: The evaluation metric function, taking two parameters pred and true; the smaller the score, the better.
The parameters for the compress method, including:
- k: The size multiplier of the generated data.
- prob: The probability of exchanging values.
- s: The standard deviation of each numerical feature.
|
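The aggregation rules described under predict.type for method = 'average' can be illustrated with a small base R sketch. The probabilities, votes, and variable names below are made up for the example, and the code only mirrors the rules as described above rather than reproducing mlr internals.

```r
# Illustrative sketch in base R; values and names are hypothetical.
classes <- c("a", "b", "c")

# Class probabilities from three base learners for a single observation.
probs <- rbind(
  learner1 = c(0.6, 0.3, 0.1),
  learner2 = c(0.2, 0.5, 0.3),
  learner3 = c(0.5, 0.4, 0.1)
)
colnames(probs) <- classes

# Base learners predict "prob": average the probabilities per class ...
avg_prob <- colMeans(probs)
# ... and for a 'response' prediction take the class with the highest average.
avg_response <- names(which.max(avg_prob))

# Base learners predict "response": relative frequency of predicted classes ...
votes <- factor(c("a", "b", "a"), levels = classes)
vote_freq <- table(votes) / length(votes)
# ... and for a 'response' prediction take the majority vote.
vote_response <- names(which.max(vote_freq))

# Regression: the final prediction is the mean of the base learner predictions.
regr_avg <- mean(c(21.0, 23.5, 22.0))
```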
Examples
# Classification
data(iris)
tsk = makeClassifTask(data = iris, target = "Species")
base = c("classif.rpart", "classif.lda", "classif.svm")
lrns = lapply(base, makeLearner)
lrns = lapply(lrns, setPredictType, "prob")
m = makeStackedLearner(base.learners = lrns,
  predict.type = "prob", method = "hill.climb")
tmp = train(m, tsk)
res = predict(tmp, tsk)
# Regression
data(BostonHousing, package = "mlbench")
tsk = makeRegrTask(data = BostonHousing, target = "medv")
base = c("regr.rpart", "regr.svm")
lrns = lapply(base, makeLearner)
m = makeStackedLearner(base.learners = lrns,
  predict.type = "response", method = "compress")
tmp = train(m, tsk)
res = predict(tmp, tsk)
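To make the hill.climb parameters more concrete, the following base R sketch shows one simple form of greedy selection with a metric of the shape described for parset (a function of pred and true, smaller is better). It is an assumption-laden illustration and not the algorithm as implemented in mlr: the helper hill_climb, its max_iter argument, the stopping rule, and the lm() models standing in for base learners are all hypothetical, and init, bagprob, and bagtime are not modeled.

```r
# Illustrative greedy selection in base R; not the mlr implementation.
d <- mtcars
truth <- d$mpg

# Hypothetical base learner predictions, one column per learner.
preds <- cbind(
  m1 = predict(lm(mpg ~ wt, data = d)),
  m2 = predict(lm(mpg ~ hp, data = d)),
  m3 = predict(lm(mpg ~ qsec, data = d))
)

# Metric taking pred and true; the smaller the score, the better.
rmse <- function(pred, true) sqrt(mean((pred - true)^2))

hill_climb <- function(preds, true, metric, replace = TRUE, max_iter = 20) {
  selected <- integer(0)
  best <- Inf
  for (i in seq_len(max_iter)) {
    all_idx <- seq_len(ncol(preds))
    # With replace = TRUE a learner may be picked more than once.
    candidates <- if (replace) all_idx else setdiff(all_idx, selected)
    if (length(candidates) == 0) break
    # Score the averaged prediction of the current selection plus each candidate.
    scores <- sapply(candidates, function(j) {
      metric(rowMeans(preds[, c(selected, j), drop = FALSE]), true)
    })
    if (min(scores) >= best) break  # stop when no candidate improves the score
    best <- min(scores)
    selected <- c(selected, candidates[which.min(scores)])
  }
  # Selection counts become the averaging weights.
  weights <- tabulate(selected, nbins = ncol(preds)) / length(selected)
  names(weights) <- colnames(preds)
  list(weights = weights, score = best)
}

res_hc <- hill_climb(preds, truth, rmse)
```

Because the sketch scores candidates on the training data itself, it is only meant to show how repeated selection turns into weights, not to serve as a sound evaluation procedure.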
[Package mlr version 2.19.2 Index]