R: Split into test and train data sets

split_test_train {wactor}

R Documentation

Split into test and train data sets

Description

Randomly partitions the input into a list of train and test data sets.

Usage

split_test_train(.data, .p = 0.8, ...)

Arguments

`.data`	Input data. If atomic (numeric, integer, character, etc.), the input is first converted to a data frame with a single column named "x".
`.p`	Proportion of the data to use for the `train` data set output. The default value is 0.80, meaning the `train` output will include roughly 80% of the input cases while the `test` output will include roughly 20%.
`...`	Optional. The response (outcome) variable. Uses tidy evaluation (quotes are not necessary). This is only relevant if the identified variable is categorical (i.e., character, factor, or logical), in which case it is used to ensure a uniform distribution of response levels in the `train` output data set. If a value is supplied, uniformity in response level observations is prioritized over the `.p` (train proportion) value, so the realized train proportion may differ from `.p`.
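
The interaction between `.p` and a categorical response can be illustrated with a small base R sketch. This is not the wactor implementation: `sketch_split()` and its `response` argument are hypothetical names, and the balancing rule (take `floor(p * n_smallest)` rows from every level) is only one plausible reading of the argument descriptions above.

```r
# Hypothetical helper, NOT part of wactor: a base R sketch of a random
# train/test partition with optional balancing on a categorical response.
sketch_split <- function(data, p = 0.8, response = NULL) {
  # Atomic input is wrapped in a one-column data frame named "x"
  if (is.atomic(data)) data <- data.frame(x = data)
  n <- nrow(data)
  if (is.null(response)) {
    # Simple random partition: draw round(p * n) row indices for train
    train_idx <- sample.int(n, size = round(p * n))
  } else {
    # Balanced partition (assumed rule): take the same number of rows
    # from every level, sized by p times the smallest level's count
    rows_by_level <- split(seq_len(n), data[[response]])
    n_per_level <- floor(p * min(lengths(rows_by_level)))
    train_idx <- unlist(lapply(rows_by_level, function(rows) {
      rows[sample.int(length(rows), size = n_per_level)]
    }), use.names = FALSE)
  }
  # Every row not drawn for train goes to test
  test_idx <- setdiff(seq_len(n), train_idx)
  list(
    train = data[train_idx, , drop = FALSE],
    test  = data[test_idx, , drop = FALSE]
  )
}

set.seed(1)
d <- data.frame(x = rnorm(100), z = c(rep("a", 80), rep("b", 20)))
s <- sketch_split(d, p = 0.8, response = "z")
table(s$train$z)  # 16 rows of "a" and 16 rows of "b"
nrow(s$test)      # 68
```

Under this assumed rule the `train` output holds 32 of the 100 rows rather than 80, which is the sense in which level uniformity can take priority over `.p`.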

Value

A list with train and test tibbles (data.frames)

Examples


## example data frame
d <- data.frame(
  x = rnorm(100),
  y = rnorm(100),
  z = c(rep("a", 80), rep("b", 20))
)

## split using defaults
split_test_train(d)

## split 0.60/0.40
split_test_train(d, 0.60)

## split with equal response level obs
split_test_train(d, 0.80, label = z)

## apply to atomic data
split_test_train(letters)

[Package wactor version 0.0.1 Index]