R: Split data into training and test sets

split {lgpr}

R Documentation

Split data into training and test sets

Description

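`split_by_factor` splits by the levels of a given factor: all rows belonging to the specified levels become test data.
`split_within_factor` splits by data point indices within each level of a factor: the points at the given positions in every level become test data.
`split_within_factor_random` selects `k_test` points from each level of a factor uniformly at random as test data.
`split_random` selects test rows uniformly at random from the whole data frame.
`split_data` splits according to the given row indices.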

Usage

split_by_factor(data, test, var_name = "id")

split_within_factor(data, idx_test, var_name = "id")

split_within_factor_random(data, k_test = 1, var_name = "id")

split_random(data, p_test = 0.2, n_test = NULL)

split_data(data, i_test, sort_ids = TRUE)
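
The calls above might be used as in the following sketch. It assumes the lgpr package is installed; the data frame `dat` and its columns `age` and `y` are hypothetical and only serve as an illustration.

```r
library(lgpr)  # assumed to be installed

set.seed(1)  # makes the random splits reproducible

# Hypothetical toy data: 4 subjects (factor `id`) with 5 measurements each
dat <- data.frame(
  id  = factor(rep(1:4, each = 5)),
  age = rep(seq(10, 50, by = 10), times = 4),
  y   = rnorm(20)
)

# Hold out every row of subjects "3" and "4"
s1 <- split_by_factor(dat, test = c("3", "4"), var_name = "id")

# Hold out the 4th and 5th data point of every subject
s2 <- split_within_factor(dat, idx_test = c(4, 5), var_name = "id")

# Hold out 2 randomly chosen data points of every subject
s3 <- split_within_factor_random(dat, k_test = 2, var_name = "id")

# Hold out exactly 5 randomly chosen rows
s4 <- split_random(dat, n_test = 5)

# Hold out explicitly listed rows
s5 <- split_data(dat, i_test = c(20, 1, 2), sort_ids = TRUE)

# Number of test rows produced by each split
sapply(list(s1, s2, s3, s4, s5), function(s) nrow(s$test))
```

Passing the factor levels as character strings in `test` avoids any ambiguity between level labels and integer codes.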

Arguments

`data`	a data frame
`test`	the levels of the factor that will be used as test data
`var_name`	name of a factor in the data
`idx_test`	indices of the test data points within each level of the factor
`k_test`	desired number of test data points for each level of the factor
`p_test`	desired proportion of test data
`n_test`	desired number of test data points (if NULL, `p_test` is used to compute this)
`i_test`	test data row indices
`sort_ids`	whether the test indices should be sorted into increasing order

Value

A named list with elements `train` and `test` (the training and test data) and `i_train` and `i_test` (the corresponding row indices).
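
How the four elements of the returned list relate to each other can be pictured with a base R sketch. `split_rows_sketch` is a hypothetical helper written for this illustration, not the package's implementation; it assumes that the training rows are simply all rows not listed as test rows.

```r
# Hypothetical helper (not part of lgpr): builds a list with the same
# four names as the documented return value.
split_rows_sketch <- function(data, i_test, sort_ids = TRUE) {
  if (sort_ids) i_test <- sort(i_test)
  # Assumption: training rows are the complement of the test rows
  i_train <- setdiff(seq_len(nrow(data)), i_test)
  list(
    train   = data[i_train, , drop = FALSE],
    test    = data[i_test, , drop = FALSE],
    i_train = i_train,
    i_test  = i_test
  )
}

# Hypothetical toy data: 2 subjects with 3 measurements each
toy <- data.frame(
  id = factor(rep(1:2, each = 3)),
  y  = c(0.1, 0.4, 0.3, 1.2, 0.9, 1.1)
)

s <- split_rows_sketch(toy, i_test = c(6, 2))
names(s)   # "train" "test" "i_train" "i_test"
s$i_test   # 2 6
s$i_train  # 1 3 4 5
```

Under this assumption, `i_train` and `i_test` together cover every row exactly once, so `nrow(s$train) + nrow(s$test)` equals `nrow(toy)`.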
