CreateStratifiedPartition {datarobot}    R Documentation

Create a stratified sampling-based S3 object of class partition for the SetTarget function

Description

Stratified partitioning is supported for binary classification problems. It randomly partitions the modeling data while keeping the percentage of positive class observations in each partition the same as in the original dataset. It can be used with either Training/Validation/Holdout ("TVH") or cross-validation ("CV") splits. In both cases, the holdout percentage (holdoutPct) must be specified. The "CV" method also requires the number of cross-validation folds (reps), and the "TVH" method also requires the validation subset percentage (validationPct).
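To make the idea of stratification concrete, the following base-R sketch splits a made-up binary target into training, validation, and holdout subsets while preserving the positive class rate. It is an illustration only and is not DataRobot's implementation; the helper assignWithinClass, the toy vector y, and the reading of both percentages as shares of all rows are assumptions introduced for this example.

```r
# Illustrative sketch only: a base-R stratified TVH split.
# This is NOT how DataRobot implements partitioning.
set.seed(42)

# Hypothetical binary target: 200 positives and 800 negatives (20% positive).
y <- rep(c(1, 0), times = c(200, 800))

holdoutPct <- 20     # percent of rows assigned to holdout
validationPct <- 16  # percent of rows assigned to validation

# Hypothetical helper: label n rows of a single class in the requested
# proportions, then shuffle so the assignment within the class is random.
assignWithinClass <- function(n, holdoutPct, validationPct) {
  nHoldout <- round(n * holdoutPct / 100)
  nValidation <- round(n * validationPct / 100)
  labels <- rep(c("holdout", "validation", "training"),
                times = c(nHoldout, nValidation, n - nHoldout - nValidation))
  sample(labels)
}

# Partition each class separately so every subset keeps the class balance.
partition <- character(length(y))
for (cls in unique(y)) {
  idx <- which(y == cls)
  partition[idx] <- assignWithinClass(length(idx), holdoutPct, validationPct)
}

# Fraction of positives per subset; compare with the overall rate, mean(y).
positiveRate <- tapply(y, partition, mean)
print(positiveRate)
```

Because each class is divided in the same proportions, the positive rate in every subset matches the overall rate, up to rounding when the class counts do not divide evenly.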

Usage

CreateStratifiedPartition(
  validationType,
  holdoutPct,
  reps = NULL,
  validationPct = NULL
)

Arguments

validationType

character. String specifying the type of partition generated, either "TVH" or "CV".

holdoutPct

integer. The percentage of data to be used as the holdout subset.

reps

integer. The number of cross-validation folds to generate; only applicable when validationType = "CV".

validationPct

integer. The percentage of data to be used as the validation subset; only applicable when validationType = "TVH".

Details

This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateGroupPartition, CreateRandomPartition, and CreateUserPartition.

Value

An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a stratified partitioning of the modeling dataset.

See Also

CreateGroupPartition, CreateRandomPartition, CreateUserPartition.

Examples

CreateStratifiedPartition(validationType = "CV", holdoutPct = 20, reps = 5)
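The example above covers the "CV" case. A corresponding "TVH" call might look like the sketch below, which assumes the datarobot package is installed. The percentages are arbitrary illustration values, and the commented-out SetTarget call uses hypothetical placeholders (project, "isBad") because it needs an authenticated DataRobot session and an existing project.

```r
library(datarobot)

# TVH split: 20% holdout and 16% validation (values chosen for illustration).
tvhPartition <- CreateStratifiedPartition(validationType = "TVH",
                                          holdoutPct = 20,
                                          validationPct = 16)

# The result is documented to be an S3 object of class 'partition'.
class(tvhPartition)

# Passing the object to SetTarget requires a live connection and a project;
# `project` and "isBad" are hypothetical placeholders, so the call is
# left commented out.
# SetTarget(project, target = "isBad", partition = tvhPartition)
```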

[Package datarobot version 2.18.6 Index]