| split {lgpr} | R Documentation |
Split data into training and test sets
Description
- split_by_factor splits according to a given factor
- split_within_factor splits according to given data point indices within the same level of a factor
- split_within_factor_random selects k points from each level of a factor uniformly at random as test data
- split_random splits uniformly at random
- split_data splits according to given data rows
Usage
split_by_factor(data, test, var_name = "id")
split_within_factor(data, idx_test, var_name = "id")
split_within_factor_random(data, k_test = 1, var_name = "id")
split_random(data, p_test = 0.2, n_test = NULL)
split_data(data, i_test, sort_ids = TRUE)
Arguments
| data | a data frame |
| test | the levels of the factor that will be used as test data |
| var_name | name of a factor in the data |
| idx_test | indices of the test data points within each level of the factor |
| k_test | desired number of test data points per level of the factor |
| p_test | desired proportion of test data |
| n_test | desired number of test data points (if NULL, p_test is used to compute this) |
| i_test | test data row indices |
| sort_ids | should the test indices be sorted into increasing order |
Value
a named list with names train, test, i_train and i_test
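Examples
The following is a minimal base R sketch of two of the splitting strategies described above. It is meant only to illustrate the idea and the documented shape of the return value. The helpers sketch_split_by_factor and sketch_split_within_factor_random and the toy data frame dat are hypothetical names introduced here; they are not part of lgpr, and the package's own implementation may differ.

```r
# Toy longitudinal data: 3 individuals (id), 4 observations each.
dat <- data.frame(
  id  = factor(rep(1:3, each = 4)),
  age = rep(c(10, 20, 30, 40), times = 3),
  y   = seq_len(12) / 10
)

# Assemble the documented return shape from a vector of test row indices.
# (Hypothetical helper, not an lgpr function.)
sketch_make_split <- function(data, i_test) {
  i_train <- setdiff(seq_len(nrow(data)), i_test)
  list(
    train   = data[i_train, , drop = FALSE],
    test    = data[i_test, , drop = FALSE],
    i_train = i_train,
    i_test  = i_test
  )
}

# Hold out every row whose factor level is listed in `test`.
# (Hypothetical helper, not an lgpr function.)
sketch_split_by_factor <- function(data, test, var_name = "id") {
  i_test <- which(data[[var_name]] %in% test)
  sketch_make_split(data, i_test)
}

# Hold out k_test rows drawn uniformly at random within each factor level.
# (Hypothetical helper, not an lgpr function.)
sketch_split_within_factor_random <- function(data, k_test = 1,
                                              var_name = "id") {
  # Row indices grouped by level; unused levels are dropped.
  rows_by_level <- split(seq_len(nrow(data)), data[[var_name]], drop = TRUE)
  picked <- lapply(rows_by_level, function(rows) {
    # Index via sample.int so a level with a single row is handled correctly.
    rows[sample.int(length(rows), k_test)]
  })
  i_test <- sort(unlist(picked, use.names = FALSE))
  sketch_make_split(data, i_test)
}

# All observations of individual 3 become test data.
s1 <- sketch_split_by_factor(dat, test = "3")
s1$i_test  # 9 10 11 12

# One randomly chosen observation per individual becomes test data.
set.seed(1)
s2 <- sketch_split_within_factor_random(dat, k_test = 1)
names(s2)  # "train" "test" "i_train" "i_test"
```

Holding out whole factor levels tests prediction for unseen individuals, whereas holding out points within each level tests prediction for new observations of individuals already seen in training.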
See Also
Other data frame handling functions:
add_dis_age(),
add_factor_crossing(),
add_factor(),
adjusted_c_hat(),
new_x()