tuneTrain {icardaFIGSr} | R Documentation |
Tuning and Training the Data
Description
tuneTrain splits the data into training and test sets, then automatically tunes and trains a model and makes predictions. It returns a list containing a model object, data frames, and a plot.
Usage
tuneTrain(
data,
y,
p = 0.7,
method = method,
parallelComputing = FALSE,
length = 10,
control = "repeatedcv",
number = 10,
repeats = 10,
process = c("center", "scale"),
summary = multiClassSummary,
positive,
...
)
Arguments
data |
object of class "data.frame" with target variable and predictor variables. |
y |
character. Target variable. |
p |
numeric. Proportion of data to be used for training. Default: 0.7 |
method |
character. Type of model to use for classification or regression. |
parallelComputing |
logical. Indicates whether to use parallel processing. Default: FALSE |
length |
integer. Number of values to output for each tuning parameter. If |
control |
character. Resampling method to use. Choices include: "boot", "boot632", "optimism_boot", "boot_all", "cv", "repeatedcv", "LOOCV", "LGOCV", "none", "oob", "timeslice", "adaptive_cv", "adaptive_boot", or "adaptive_LGOCV". Default: "repeatedcv". See |
number |
integer. Number of cross-validation folds or number of resampling iterations. Default: 10. |
repeats |
integer. Number of times the k-fold cross-validation is repeated if "repeatedcv" is chosen as the resampling method in |
process |
character. Defines the pre-processing transformation of predictor variables to be done. Options are: "BoxCox", "YeoJohnson", "expoTrans", "center", "scale", "range", "knnImpute", "bagImpute", "medianImpute", "pca", "ica", or "spatialSign". See |
summary |
expression. Computes performance metrics across resamples. For numeric |
positive |
character. The positive class for the target variable if |
... |
additional arguments to be passed to |
Details
Types of classification and regression models available for use with tuneTrain can be found using names(getModelInfo()). The results given depend on the type of model used.
For classification models, class probabilities and a ROC curve are given in the results. For regression models, predictions and a plot of residuals versus predicted values are given. y should be converted to a factor if performing classification, or to numeric if performing regression, before specifying it in tuneTrain.
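The two preparation steps described above can be sketched as follows. This is a minimal illustration, not code from the package: the data frame toy and its columns status and yield are hypothetical, and the call to getModelInfo is skipped when caret is not installed.

```r
# Hypothetical toy data; the column names are illustrative only.
toy <- data.frame(
  status = c("R", "S", "R", "S"),
  yield  = c("2.1", "3.4", "2.8", "3.0"),
  stringsAsFactors = FALSE
)

# Classification: the target must be a factor.
toy$status <- as.factor(toy$status)

# Regression: the target must be numeric.
toy$yield <- as.numeric(toy$yield)

# List the model names that can be supplied as `method`.
if (requireNamespace("caret", quietly = TRUE)) {
  available <- names(caret::getModelInfo())
  print(head(available))
  print("knn" %in% available)
}
```

Converting the target before the call makes the choice between classification and regression explicit instead of leaving it to however the column happened to be read in.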
Value
A list object with results from tuning and training the model selected in method, together with predictions and class probabilities. The training and test data sets obtained from splitting the data are also returned.
If y is a factor, class probabilities are calculated for each class. If y is numeric, predicted values are calculated.
A ROC curve is created if y is a factor. Otherwise, if y is numeric, a plot of residuals versus predicted values is created.
tuneTrain relies on packages caret, ggplot2 and plotROC to perform the modelling and plotting.
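For orientation, the split, tune, train and predict sequence can be approximated directly with caret. The sketch below is an assumption about the general workflow, not the package's actual implementation: it uses the built-in iris data instead of package data, smaller resampling settings than the tuneTrain defaults to keep it quick, and it omits the plotting step.

```r
library(caret)

set.seed(1)  # reproducible split and resampling

# 1. Split: a proportion p = 0.7 of rows goes to training, stratified on the target.
idx <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_set <- iris[idx, ]
test_set  <- iris[-idx, ]

# 2. Resampling scheme (illustrative values, smaller than the documented defaults).
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2,
                     classProbs = TRUE)

# 3. Tune and train: evaluate 5 candidate values of k after centering and scaling.
fit <- train(Species ~ ., data = train_set, method = "knn",
             preProcess = c("center", "scale"),
             tuneLength = 5, trControl = ctrl)

# 4. Predict classes and class probabilities on the held-out rows.
pred_class <- predict(fit, newdata = test_set)
pred_prob  <- predict(fit, newdata = test_set, type = "prob")
```

In this sketch p, control, number, repeats, process and length from the Usage section correspond to the p, method, number, repeats, preProcess and tuneLength arguments of the caret functions; that correspondence is inferred from the argument descriptions and is not confirmed by the package source.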
Author(s)
Zakaria Kehel, Bancy Ngatia, Khadija Aziz
See Also
createDataPartition, trainControl, train, predict.train, ggplot, geom_roc, calc_auc
Examples
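if (interactive()) {
  # Load the example data set
  data(septoriaDurumWC)
  # Tune and train a k-nearest neighbours model; 'R' is the positive class
  knn.mod <- tuneTrain(data = septoriaDurumWC, y = 'ST_S', method = 'knn', positive = 'R')
  # Same data and target, using a neural network model
  nnet.mod <- tuneTrain(data = septoriaDurumWC, y = 'ST_S', method = 'nnet', positive = 'R')
}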