R: Train-Test-Split

train_test_split {creditmodel}

R Documentation

Train-Test-Split

Description

`train_test_split` partitions a data set into train and test sets.

Usage

train_test_split(
  dat,
  prop = 0.7,
  split_type = "Random",
  occur_time = NULL,
  cut_date = NULL,
  start_date = NULL,
  save_data = FALSE,
  dir_path = tempdir(),
  file_name = NULL,
  note = FALSE,
  seed = 43
)

Arguments

`dat`	A data.frame with independent variables and target variable.
`prop`	The proportion of samples assigned to the train set. Default is 0.7.
`split_type`	Partition method. "Random" splits the train and test sets at random. "OOT" splits by time, for an observation-over-time test. "byRow" splits by row number. Default is "Random".
`occur_time`	The name of the variable that represents the time at which each observation takes place. It is used for "OOT" split.
`cut_date`	Time point at which to split the data, e.g. to separate Actual and Expected data sets.
`start_date`	The earliest occurrence time of observations.
`save_data`	Logical. Whether to save the results to the folder given by `dir_path`. Default is FALSE.
`dir_path`	The folder in which the data file is saved. Default is tempdir().
`file_name`	The name of the saved data file. Default is NULL.
`note`	Logical. Whether to output progress information. Default is FALSE.
`seed`	Random number seed. Default is 43.
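
The three `split_type` options can be pictured with a small base-R sketch. This is not the package's implementation: the data frame `toy`, its column `apply_date`, and the helpers `split_random`, `split_by_row`, and `split_oot` are hypothetical names made up for illustration, and the exact rules (first rows go to train for "byRow", rows before the cut date go to train for "OOT") are assumptions about the general idea rather than documented behaviour.

```r
# Illustrative only: a base-R sketch of the three split ideas,
# NOT creditmodel's own code. All names below are hypothetical.
toy <- data.frame(
  id = 1:10,
  apply_date = as.Date("2020-01-01") + 0:9,
  target = rep(c(0, 1), 5)
)

# "Random": draw a share `prop` of the row indices for the train set.
split_random <- function(dat, prop = 0.7, seed = 43) {
  set.seed(seed)
  all_rows <- seq_len(nrow(dat))
  train_rows <- sample(all_rows, size = round(nrow(dat) * prop))
  list(train = dat[train_rows, , drop = FALSE],
       test = dat[setdiff(all_rows, train_rows), , drop = FALSE])
}

# "byRow" (assumed rule): the first `prop` share of rows, in their
# existing order, forms the train set.
split_by_row <- function(dat, prop = 0.7) {
  all_rows <- seq_len(nrow(dat))
  train_rows <- seq_len(round(nrow(dat) * prop))
  list(train = dat[train_rows, , drop = FALSE],
       test = dat[setdiff(all_rows, train_rows), , drop = FALSE])
}

# "OOT" (assumed rule): rows observed before the cut date form the
# train set; rows on or after it form the test set.
split_oot <- function(dat, occur_time, cut_date) {
  is_train <- dat[[occur_time]] < as.Date(cut_date)
  list(train = dat[is_train, , drop = FALSE],
       test = dat[!is_train, , drop = FALSE])
}

parts <- split_oot(toy, occur_time = "apply_date", cut_date = "2020-01-08")
```

A time-based split keeps later observations out of the train set, which is why it needs `occur_time` where the other two methods do not.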

Value

A list with two elements, `train` and `test`, holding the train and test partitions.

Examples

train_test = train_test_split(lendingclub,
                              split_type = "OOT", prop = 0.7,
                              occur_time = "issue_d", seed = 12, save_data = FALSE)
dat_train = train_test$train
dat_test = train_test$test
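
As a rough follow-up, the realised train share can be compared with `prop` after a split. The sketch below assumes the return value has `train` and `test` elements, as the example above suggests. `share_in_train` is a hypothetical helper, not part of creditmodel, and the package call runs only if creditmodel is installed.

```r
# Hypothetical helper: share of observations that ended up in the train part.
# NROW() works whether the parts are data frames or plain index vectors.
share_in_train <- function(parts) {
  NROW(parts$train) / (NROW(parts$train) + NROW(parts$test))
}

# Runs only when creditmodel is available; arguments follow the Usage section.
if (requireNamespace("creditmodel", quietly = TRUE)) {
  library(creditmodel)
  parts <- train_test_split(lendingclub, split_type = "Random",
                            prop = 0.7, seed = 43)
  print(share_in_train(parts))
}
```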

[Package creditmodel version 1.3.1 Index]