get.test {ModelMap} | R Documentation |
Randomly Divide Data into Training and Test Sets
Description
Uses random selection to split a data set into training and test sets.
Usage
get.test(proportion.test, qdatafn = NULL, seed = NULL, folder=NULL,
qdata.trainfn = paste(strsplit(qdatafn, split = ".csv")[[1]], "_train.csv", sep = ""),
qdata.testfn = paste(strsplit(qdatafn, split = ".csv")[[1]], "_test.csv", sep = ""))
Arguments
proportion.test |
Number. The proportion of the training data that will be randomly extracted for use as a test set. Value between 0 and 1. |
qdatafn |
String. The name (basename or full path) of the data file to be split into training and test data. This data should include both response and predictor variables. The file must be a comma-delimited file |
seed |
Integer. The number used to initialize randomization to randomly select rows for a test data set. If you want to produce the same model later, use the same seed. If |
folder |
String. The folder used for all output from predictions and/or maps. Do not add ending slash to path string. If |
qdata.trainfn |
String. The name of the output file for the training data. By default, the name of qdatafn with ".csv" replaced by "_train.csv". |
qdata.testfn |
String. The name of the output file for the test data. By default, the name of qdatafn with ".csv" replaced by "_test.csv". |
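The default output names can be checked by evaluating the expressions from the Usage section directly. The sketch below does this for a hypothetical input file name, "mydata.csv", which is an assumption made only for illustration.

```r
# Hypothetical input file name, used only for illustration
qdatafn <- "mydata.csv"

# Same expressions as the defaults shown under Usage
qdata.trainfn <- paste(strsplit(qdatafn, split = ".csv")[[1]], "_train.csv", sep = "")
qdata.testfn  <- paste(strsplit(qdatafn, split = ".csv")[[1]], "_test.csv", sep = "")

print(qdata.trainfn)  # "mydata_train.csv"
print(qdata.testfn)   # "mydata_test.csv"
```

Because split is treated as a regular expression by strsplit, the "." matches any character, so file names that contain "csv" elsewhere in the name may be split in unexpected places.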
Details
This function should be run once, before starting analysis, to create the training and test sets. If the cross-validation option is to be used with random forest (RF) or stochastic gradient boosting (SGB) models, or if the out-of-bag (OOB) option is to be used for RF models, then this step is unnecessary.
Value
Outputs a training data file and a test data file. Unless qdata.trainfn or qdata.testfn is specified, the output will be located in folder. Together the two files contain the same rows as the original data, and each file has the same columns.
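The kind of split described here can be sketched in base R. This is a minimal illustration and not the ModelMap implementation: the helper split_train_test and the toy data frame are hypothetical, and the package may round the test-set size or select rows differently.

```r
# Illustrative helper (not part of ModelMap): randomly split a data frame
split_train_test <- function(qdata, proportion.test, seed = NULL) {
  stopifnot(proportion.test > 0, proportion.test < 1)
  # Setting the seed makes the random selection repeatable
  if (!is.null(seed)) set.seed(seed)
  all.rows  <- seq_len(nrow(qdata))
  n.test    <- round(nrow(qdata) * proportion.test)
  test.rows <- sample(all.rows, size = n.test)
  # Every row not drawn for the test set stays in the training set
  train.rows <- setdiff(all.rows, test.rows)
  list(train = qdata[train.rows, , drop = FALSE],
       test  = qdata[test.rows, , drop = FALSE])
}

# Toy data standing in for a real response/predictor table
toy <- data.frame(id = 1:10, response = (1:10) * 2, predictor = 10:1)

parts  <- split_train_test(toy, proportion.test = 0.2, seed = 42)
parts2 <- split_train_test(toy, proportion.test = 0.2, seed = 42)

nrow(parts$train)                        # 8 rows kept for training
nrow(parts$test)                         # 2 rows held out for testing
identical(parts$test$id, parts2$test$id) # same seed, same test rows
```

Each row lands in exactly one of the two pieces, and both pieces keep all of the original columns.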
Author(s)
Elizabeth Freeman
Examples
## Not run:
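# Path to the example training data shipped with ModelMap
qdatafn <- system.file("extdata", "helpexamples", "DATATRAIN.csv", package = "ModelMap")
qdata <- read.table(file = qdatafn, sep = ",", header = TRUE, check.names = FALSE)
# Hold out 20% of the rows as a test set; reusing the seed repeats the same split
get.test(proportion.test = 0.2,
         qdatafn = qdatafn,
         seed = 42,
         folder = getwd(),
         qdata.trainfn = "example.train.csv",
         qdata.testfn = "example.test.csv")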
## End(Not run) # end dontrun