train_test_split {less}    R Documentation

Dataset splitting

Description

Split data frames or matrices into random train and test subsets. The column of data at position y_index is taken as the response variable (y), and the remaining columns as the independent variables (X).

Usage

train_test_split(
  data,
  test_size = 0.3,
  random_state = NULL,
  y_index = ncol(data)
)

Arguments

data

Data frame or matrix to be split

test_size

Proportion of the dataset to include in the test split. Should be between 0.0 and 1.0 (defaults to 0.3)

random_state

Controls the shuffling applied to the data before the split. Pass an integer for reproducible output across multiple function calls (defaults to NULL)

y_index

Column index of the response variable y (defaults to the last column of data)

Value

A list of length 4 with elements:

X_train Training input variables
X_test Test input variables
y_train Training response variables
y_test Test response variables
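
The behaviour described above can be illustrated with a minimal base R sketch. This is not the implementation used by less: the helper name sketch_split is hypothetical, and the rounding of the test-set size and the use of set.seed() for random_state are assumptions made only for illustration. The built-in mtcars dataset stands in for user data.

```r
# Hypothetical helper (NOT the less implementation): mimics the documented
# arguments and return value of train_test_split() using base R only.
sketch_split <- function(data, test_size = 0.3, random_state = NULL,
                         y_index = ncol(data)) {
  # Assumption: random_state is used as an RNG seed.
  # Note that set.seed() changes the global RNG state.
  if (!is.null(random_state)) set.seed(random_state)

  n <- nrow(data)
  n_test <- round(n * test_size)          # assumption: size is rounded
  test_rows <- sample.int(n, n_test)      # random rows for the test subset
  train_rows <- setdiff(seq_len(n), test_rows)

  list(
    X_train = data[train_rows, -y_index, drop = FALSE],
    X_test  = data[test_rows,  -y_index, drop = FALSE],
    y_train = data[train_rows, y_index],
    y_test  = data[test_rows,  y_index]
  )
}

# Use the first column of mtcars (mpg) as the response variable.
parts <- sketch_split(mtcars, test_size = 0.25, random_state = 1, y_index = 1)
str(parts, max.level = 1)
```

Calling the helper twice with the same random_state should return identical subsets, which is the reproducibility property the random_state argument is documented to provide.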

Examples

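# Load the example dataset shipped with the package
data(abalone)

# Hold out 30% of the rows for testing; the last column is the response
split_list <- train_test_split(abalone, test_size = 0.3)

# The returned list holds, in order: X_train, X_test, y_train, y_test
X_train <- split_list[[1]]
X_test <- split_list[[2]]
y_train <- split_list[[3]]
y_test <- split_list[[4]]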

print(head(X_train))
print(head(X_test))
print(head(y_train))
print(head(y_test))

[Package less version 0.1.0 Index]