CreateStratifiedPartition {datarobot}    R Documentation
Create a stratified sampling-based S3 object of class partition for the SetTarget function
Description
Stratified partitioning is supported for binary classification problems. It randomly partitions the modeling data while keeping the percentage of positive-class observations in each partition the same as in the original dataset. Two split types are supported: Training/Validation/Holdout ("TVH") and cross-validation ("CV"). In both cases, the holdout percentage (holdoutPct) must be specified. The "CV" method also requires the number of cross-validation folds (reps), and the "TVH" method also requires the validation subset percentage (validationPct).
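To make the idea of stratification concrete, the following base-R sketch assigns rows to a holdout set and to cross-validation folds separately within each class, so that every partition keeps roughly the original share of positive observations. It is an illustration only and is not DataRobot's implementation: the helper name stratified_assign, the partition labels, and the toy data are all assumptions made for this example.

```r
# Hypothetical helper (not part of datarobot): label each row as "Holdout"
# or as one of `reps` folds, sampling within each class separately.
stratified_assign <- function(y, holdoutPct, reps) {
  assignment <- character(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    idx <- idx[sample.int(length(idx))]               # shuffle rows of this class
    nHoldout <- round(length(idx) * holdoutPct / 100) # rows of this class held out
    holdoutIdx <- idx[seq_len(nHoldout)]
    restIdx <- idx[seq_len(length(idx) - nHoldout) + nHoldout]
    assignment[holdoutIdx] <- "Holdout"
    # Deal the remaining rows of this class round-robin into the folds
    assignment[restIdx] <- paste0("Fold", (seq_along(restIdx) - 1) %% reps + 1)
  }
  assignment
}

set.seed(1)
y <- c(rep(1, 200), rep(0, 800))   # toy target with a 20% positive class
parts <- stratified_assign(y, holdoutPct = 20, reps = 5)

# Share of positive observations in each partition
tapply(y, parts, mean)
```

Because the class counts in this toy data divide evenly, every partition ends up with a positive rate of exactly 0.2, matching the full dataset. With real data the rates would agree only approximately.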
Usage
CreateStratifiedPartition(
validationType,
holdoutPct,
reps = NULL,
validationPct = NULL
)
Arguments
validationType
character. String specifying the type of partition generated, either "TVH" or "CV".
holdoutPct
integer. The percentage of data to be used as the holdout subset.
reps
integer. The number of cross-validation folds to generate; only applicable when validationType = "CV".
validationPct
integer. The percentage of data to be used as the validation subset; only applicable when validationType = "TVH".
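The dependency between validationType and the optional arguments can be summarized in a small checker. This is a sketch of the rules as documented here, not the package's own validation code. The function check_partition_args and its error messages are hypothetical, and the validationPct value of 16 is an arbitrary choice for illustration.

```r
# Hypothetical checker mirroring the documented argument rules;
# it is not part of the datarobot package.
check_partition_args <- function(validationType, holdoutPct,
                                 reps = NULL, validationPct = NULL) {
  if (!validationType %in% c("TVH", "CV")) {
    stop("validationType must be \"TVH\" or \"CV\"")
  }
  # "CV" splits need the number of folds
  if (validationType == "CV" && is.null(reps)) {
    stop("reps must be specified when validationType = \"CV\"")
  }
  # "TVH" splits need the validation percentage
  if (validationType == "TVH" && is.null(validationPct)) {
    stop("validationPct must be specified when validationType = \"TVH\"")
  }
  invisible(TRUE)
}

check_partition_args("CV", holdoutPct = 20, reps = 5)             # passes
check_partition_args("TVH", holdoutPct = 20, validationPct = 16)  # passes
# check_partition_args("TVH", holdoutPct = 20)  # would stop: validationPct missing
```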
Details
This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateGroupPartition, CreateRandomPartition, and CreateUserPartition.
Value
An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a stratified partitioning of the modeling dataset.
See Also
CreateGroupPartition, CreateRandomPartition, CreateUserPartition.
Examples
CreateStratifiedPartition(validationType = "CV", holdoutPct = 20, reps = 5)