split_data {AutoScore}

R Documentation

AutoScore Function: Automatically split a dataset into training, validation and test sets, optionally stratified by label

Description

Splits a dataset into training, validation and test sets according to `ratio`. The split can optionally be stratified by the outcome label.

Usage

split_data(data, ratio, cross_validation = FALSE, strat_by_label = FALSE)

Arguments

`data`	The dataset to be split
`ratio`	The proportions used to divide the dataset into training, validation and test sets, in that order. (Default: c(0.7, 0.1, 0.2))
`cross_validation`	If set to `TRUE`, cross-validation is used to generate the parsimony plot, which is suitable for small datasets. Defaults to `FALSE`
`strat_by_label`	If set to `TRUE`, the split is stratified on the outcome variable. Defaults to `FALSE`
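
To make the roles of `ratio` and `strat_by_label` concrete, the following base R sketch shows one way a ratio-based split, optionally stratified on a `label` column, could work. It is an illustration under stated assumptions, not the AutoScore implementation: `sketch_split`, its rounding rule and the names of the returned elements are all hypothetical.

```r
# Hypothetical helper, NOT the AutoScore implementation: splits the rows of
# `data` by `ratio`, optionally within each level of a `label` column.
sketch_split <- function(data, ratio = c(0.7, 0.1, 0.2), strat_by_label = FALSE) {
  stopifnot(length(ratio) == 3, all(ratio >= 0), sum(ratio) > 0)
  ratio <- ratio / sum(ratio)  # normalise so the proportions sum to 1

  # Shuffle a vector of row indices and assign each one to a set.
  assign_sets <- function(idx) {
    n <- length(idx)
    idx <- idx[sample.int(n)]
    n_train <- min(round(ratio[1] * n), n)
    n_valid <- min(round(ratio[2] * n), n - n_train)
    n_test <- n - n_train - n_valid  # the remainder goes to the test set
    data.frame(
      row = idx,
      set = rep(c("train", "validation", "test"),
                times = c(n_train, n_valid, n_test))
    )
  }

  # One group overall, or one group per outcome level when stratifying.
  if (strat_by_label) {
    groups <- split(seq_len(nrow(data)), data$label)
  } else {
    groups <- list(seq_len(nrow(data)))
  }
  assignment <- do.call(rbind, lapply(groups, assign_sets))

  pick <- function(set_name) {
    data[assignment$row[assignment$set == set_name], , drop = FALSE]
  }
  # Element names are illustrative only.
  list(train_set = pick("train"),
       validation_set = pick("validation"),
       test_set = pick("test"))
}

# Toy data: 1000 rows, 20% positive outcomes.
set.seed(4)
toy <- data.frame(x = rnorm(1000), label = rep(c(1, 0), times = c(200, 800)))
plain <- sketch_split(toy, ratio = c(0.7, 0.1, 0.2))
strat <- sketch_split(toy, ratio = c(0.7, 0.1, 0.2), strat_by_label = TRUE)
```

In this sketch, stratification keeps the share of positive labels the same in every set, whereas the plain split only matches it approximately.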

Value

A list containing the training, validation and test sets
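
After splitting, it can be useful to check the size and outcome balance of each set. The sketch below is one way to do that for any list of data frames. It assumes each element has a `label` column; `summarise_split`, its `positive` argument and the element names in the toy list are hypothetical and not part of AutoScore.

```r
# Hypothetical helper: one summary row per element of a split list.
# Assumes every element is a data frame with a `label` column.
summarise_split <- function(split_list, positive = 1) {
  data.frame(
    set = if (is.null(names(split_list))) {
      as.character(seq_along(split_list))
    } else {
      names(split_list)
    },
    n_rows = vapply(split_list, nrow, integer(1)),
    # Share of rows whose label equals `positive` (NaN for an empty set).
    label_share = vapply(split_list,
                         function(d) mean(d$label == positive),
                         numeric(1)),
    row.names = NULL
  )
}

# Toy stand-in for a split result; the element names are illustrative.
toy_split <- list(
  train_set = data.frame(label = c(0, 0, 1, 1)),
  test_set = data.frame(label = c(0, 1))
)
split_summary <- summarise_split(toy_split, positive = 1)
print(split_summary)
```

The same call could be applied to the list returned by `split_data`, provided its elements are data frames with a `label` column and `positive` is set to the value that marks a positive outcome in that data.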

Examples

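library(AutoScore)

# Load the example data and rename the outcome column to "label"
data("sample_data")
names(sample_data)[names(sample_data) == "Mortality_inpatient"] <- "label"

# Fix the random seed so the splits are reproducible
set.seed(4)

# Large sample size: separate training, validation and test sets
out_split <- split_data(data = sample_data, ratio = c(0.7, 0.1, 0.2))

# Small sample size: validation share of 0, with cross-validation enabled
out_split <- split_data(data = sample_data, ratio = c(0.7, 0, 0.3),
                        cross_validation = TRUE)

# Large sample size, stratified on the outcome variable
out_split <- split_data(data = sample_data, ratio = c(0.7, 0.1, 0.2),
                        strat_by_label = TRUE)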

[Package AutoScore version 1.0.0 Index]