R: Create a bagging learner

mlr_graphs_bagging {mlr3pipelines}

R Documentation

Create a bagging learner

Description

Creates a Graph that performs bagging for a supplied graph. This is done as follows:

1. Subsample the data in each step using PipeOpSubsample, then apply graph to the subsample.
2. Replicate this step iterations times (in parallel via multiplicities).
3. Average the predictions of the replicated graphs using the averager (note that the averager must be constructed with collect_multiplicity = TRUE).

All input arguments are cloned and have no references in common with the returned Graph.
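
The three steps above can be sketched by hand with standard mlr3pipelines building blocks. This is only an illustration under assumptions: the choice of `po("replicate")` to create the multiplicity, the `mtcars` task, and the name `gr_manual` are not taken from this page, and the graph that `pipeline_bagging()` builds internally may differ in details such as cloning and PipeOp ids.

```r
library(mlr3)
library(mlr3pipelines)

# Hand-built approximation of a bagging graph (illustrative, not the package internals):
# replicate the input 3 times, subsample each copy, fit the learner on each subsample,
# then average the resulting predictions.
gr_manual = po("replicate", reps = 3) %>>%
  po("subsample", frac = 0.7, replace = FALSE) %>>%
  po("learner", lrn("regr.rpart")) %>>%
  po("regravg", collect_multiplicity = TRUE)

# Wrap the graph as a learner and use it like any other mlr3 learner.
glrn = GraphLearner$new(gr_manual)
task = tsk("mtcars")  # assumed example task shipped with mlr3
glrn$train(task)
pred = glrn$predict(task)
```

The averager comes last because it collects the multiplicity produced by the replicated branches into a single prediction.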

Usage

pipeline_bagging(
  graph,
  iterations = 10,
  frac = 0.7,
  averager = NULL,
  replace = FALSE
)

Arguments

`graph`	`PipeOp` \| `Graph` A `PipeOpLearner` or `Graph` to create a bagging pipeline for. Outputs from the replicated `graph`s are connected with the `averager`.
`iterations`	`integer(1)` Number of bagging iterations. Defaults to 10.
`frac`	`numeric(1)` Fraction of rows to keep during subsampling. See `PipeOpSubsample` for more information. Defaults to 0.7.
`averager`	`PipeOp` \| `Graph` A `PipeOp` or `Graph` that averages the predictions from the replicated and subsampled graphs. In the simplest case, `po("classifavg")` and `po("regravg")` can be used to perform simple averaging of classification and regression predictions, respectively. If `NULL` (default), no averager is added to the end of the graph. Note that the averager must be constructed with `collect_multiplicity = TRUE`.
`replace`	`logical(1)` Whether to sample with replacement. Default `FALSE`.
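
As a rough illustration of how the arguments fit together for classification, the sketch below passes `po("classifavg")` as the averager. The `iris` task, the `classif.rpart` learner, and the variable names are assumptions chosen for the example, not part of this page.

```r
library(mlr3)
library(mlr3pipelines)

# Classification learner wrapped as a PipeOp (assumed example learner).
lrn_po = po("learner", lrn("classif.rpart"))

# 5 bagging iterations, each trained on 70% of the rows drawn without replacement;
# the averager must be constructed with collect_multiplicity = TRUE.
gr = pipeline_bagging(lrn_po, iterations = 5, frac = 0.7, replace = FALSE,
  averager = po("classifavg", collect_multiplicity = TRUE))

glrn = GraphLearner$new(gr)
task = tsk("iris")  # assumed example task shipped with mlr3
glrn$train(task)
pred = glrn$predict(task)
```

With `averager = NULL` the returned graph ends in the replicated learners, so an averager would have to be appended separately before the graph can produce a single prediction.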

Value

Graph

Examples



library(mlr3)
library(mlr3pipelines)
lrn_po = po("learner", lrn("regr.rpart"))
task = mlr_tasks$get("boston_housing")
gr = pipeline_bagging(lrn_po, 3, averager = po("regravg", collect_multiplicity = TRUE))
resample(task, GraphLearner$new(gr), rsmp("holdout"))$aggregate()

# The original bagging method uses bootstrapping, i.e. sampling with replacement.
gr = ppl("bagging", lrn_po, frac = 1, replace = TRUE,
  averager = po("regravg", collect_multiplicity = TRUE))
resample(task, GraphLearner$new(gr), rsmp("holdout"))$aggregate()

[Package mlr3pipelines version 0.6.0 Index]