partition {mlr3} | R Documentation
Manually Partition into Training and Test Set
Description
Creates a split of the row ids of a Task into a training set and a test set while optionally stratifying on the target column.
For more complex partitions, see the example.
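The idea of a stratified split can be illustrated in base R. This is a minimal sketch, not mlr3's implementation: `stratified_split()` is a hypothetical helper introduced here for illustration, and it assumes that stratification means drawing the same share of row ids from each class.

```r
# Hypothetical helper (not part of mlr3): stratified train/test split of row ids.
stratified_split = function(strata, ratio = 0.67) {
  ids = seq_along(strata)
  # draw a `ratio` share of the row ids within each stratum
  train = unlist(lapply(split(ids, strata), function(group) {
    group[sample.int(length(group), size = round(ratio * length(group)))]
  }), use.names = FALSE)
  # all remaining row ids form the test set
  list(train = sort(train), test = setdiff(ids, train))
}

set.seed(1)
parts = stratified_split(iris$Species, ratio = 0.5)
lengths(parts)                      # 75 row ids in each set
table(iris$Species[parts$train])    # 25 rows per species
```

Because sampling happens within each class, the class proportions in both sets match those of the full data, up to rounding.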
Usage
partition(task, ratio = 0.67, stratify = TRUE, ...)
## S3 method for class 'TaskRegr'
partition(task, ratio = 0.67, stratify = TRUE, bins = 3L, ...)
## S3 method for class 'TaskClassif'
partition(task, ratio = 0.67, stratify = TRUE, ...)
Arguments
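task | (Task) Task whose row ids are split.
ratio | (numeric) Proportion of row ids assigned to the training set. Default: 0.67.
stratify | (logical(1)) Whether to stratify on the target column. Default: TRUE.
... | (any) Additional arguments passed to methods.
bins | (integer(1)) Number of bins the target is cut into for stratification (TaskRegr method only). Default: 3L.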
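For a numeric target, stratification needs discrete groups, which is the role of bins. The following base R sketch shows one way to bin a response. It assumes equal-width intervals via cut(); the source does not state which binning rule mlr3 uses, and the simulated vector y is purely illustrative.

```r
set.seed(1)
y = rnorm(300)                  # stand-in for a numeric target
strata = cut(y, breaks = 3L)    # factor with 3 equal-width intervals over range(y)
table(strata)                   # number of observations per bin

# draw half of the row ids from every bin
train = unlist(lapply(split(seq_along(y), strata), function(group) {
  group[sample.int(length(group), size = round(0.5 * length(group)))]
}), use.names = FALSE)

# the response should be similarly distributed in both parts
summary(y[train])
summary(y[-train])
```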
Examples
# regression task
task = tsk("boston_housing")
# roughly equal size split while stratifying on the binned response
split = partition(task, ratio = 0.5)
data = data.frame(
  y = c(task$truth(split$train), task$truth(split$test)),
  # label each row by its set; subsetting keeps only the two sets used above
  split = rep(c("train", "test"), lengths(split[c("train", "test")]))
)
boxplot(y ~ split, data = data)
# classification task
task = tsk("pima")
split = partition(task)
# roughly same distribution of the target label
prop.table(table(task$truth()))
prop.table(table(task$truth(split$train)))
prop.table(table(task$truth(split$test)))
# splitting into 3 disjoint sets, using ResamplingCV and stratification
task = tsk("iris")
task$set_col_roles(task$target_names, add_to = "stratum")
r = rsmp("cv", folds = 3)$instantiate(task)
# the test sets of a 3-fold CV are pairwise disjoint (the training sets overlap)
sets = lapply(1:3, r$test_set)
lengths(sets)
prop.table(table(task$truth(sets[[1]])))
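A typical use of the returned row ids is to fit a learner on the training set and evaluate it on the test set. The sketch below assumes the rpart package is installed; the learner "classif.rpart", the measure "classif.acc" and the ratio of 0.8 are illustrative choices, not part of this help page.

```r
library(mlr3)

set.seed(1)
task = tsk("pima")
split = partition(task, ratio = 0.8)

# fit on the training row ids only
learner = lrn("classif.rpart")
learner$train(task, row_ids = split$train)

# predict on the held-out test row ids and score the predictions
prediction = learner$predict(task, row_ids = split$test)
prediction$score(msr("classif.acc"))
```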
[Package mlr3 version 0.20.2 Index]