R: Get modeling metrics

modelingSummary {icardaFIGSr}

R Documentation

Get modeling metrics

Description

modelingSummary models the data automatically and returns a data frame containing the performance metrics of five machine learning algorithms: KNN, SVM, RF, NNET, and Bcart. The function is built on the splitData, tuneTrain, predict, and getMetrics functions.
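The split, tune/train, predict, and metrics sequence named above can be sketched with plain caret calls. This is only an approximation under stated assumptions: it uses a two-class subset of `iris` as hypothetical toy data, fits a single algorithm (KNN), and calls `createDataPartition`, `train`, `predict`, and `postResample` directly instead of the package's own helpers, whose internals may differ.

```r
# Sketch only: approximates split -> tune/train -> predict -> metrics
# with caret calls; the package's own helper functions may differ.
if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)
  set.seed(1)

  # Two-class toy data derived from iris (an assumption for illustration).
  toy <- droplevels(iris[iris$Species != "setosa", ])

  # 1. Split: 70% of the rows for training, stratified on the target.
  idx      <- createDataPartition(toy$Species, p = 0.7, list = FALSE)
  trainSet <- toy[idx, ]
  testSet  <- toy[-idx, ]

  # 2. Tune and train one algorithm (KNN) on centered and scaled predictors.
  ctrl <- trainControl(method = "cv", number = 5)
  fit  <- train(Species ~ ., data = trainSet, method = "knn",
                preProcess = c("center", "scale"),
                tuneLength = 5, trControl = ctrl)

  # 3. Predict the held-out rows.
  pred <- predict(fit, newdata = testSet)

  # 4. Compute accuracy and Kappa against the observed classes.
  metrics <- postResample(pred = pred, obs = testSet$Species)
  print(metrics)
}
```

Repeating steps 2 to 4 for each algorithm and binding the metric vectors row-wise would give one row per model, which is the general shape of the summary this function returns.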

Usage

modelingSummary(
  data,
  y,
  p = 0.7,
  length = 10,
  control = "repeatedcv",
  number = 10,
  repeats = 10,
  process = c("center", "scale"),
  summary = multiClassSummary,
  positive,
  parallelComputing = FALSE,
  classtype,
  ...
)

Arguments

`data`	object of class "data.frame" with target variable and predictor variables.
`y`	character. Target variable.
`p`	numeric. Proportion of data to be used for training. Default: 0.7
`length`	integer. Number of values to output for each tuning parameter. If `search = "random"` is passed to `trainControl` through `...`, this becomes the maximum number of tuning parameter combinations that are generated by the random search. Default: 10.
`control`	character. Resampling method to use. Choices include: "boot", "boot632", "optimism_boot", "boot_all", "cv", "repeatedcv", "LOOCV", "LGOCV", "none", "oob", "timeslice", "adaptive_cv", "adaptive_boot", or "adaptive_LGOCV". Default: "repeatedcv". See `train` for specific details on the resampling methods.
`number`	integer. Number of cross-validation folds or number of resampling iterations. Default: 10.
`repeats`	integer. Number of complete sets of folds to compute for repeated k-fold cross-validation if "repeatedcv" is chosen as the resampling method in `control`. Default: 10.
`process`	character. Defines the pre-processing transformation of predictor variables to be done. Options are: "BoxCox", "YeoJohnson", "expoTrans", "center", "scale", "range", "knnImpute", "bagImpute", "medianImpute", "pca", "ica", or "spatialSign". See `preProcess` for specific details on each pre-processing transformation. Default: c('center', 'scale').
`summary`	expression. Computes performance metrics across resamples. With `defaultSummary`, the root mean squared error and R-squared are calculated for numeric `y`, and the overall accuracy and Kappa are calculated for factor `y`. See `trainControl` and `defaultSummary` for details on specification and summary options. Default: multiClassSummary.
`positive`	character. The positive class for the target variable if `y` is factor. Usually, it is the first level of the factor.
`parallelComputing`	logical. Indicates whether to use parallel processing. Default: FALSE.
`classtype`	integer. Indicates the number of classes of the trait.
`...`	additional arguments to be passed to `createDataPartition`, `trainControl` and `train` functions in the package `caret`.
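To make the interaction of `p`, `number`, `repeats`, and `length` concrete, the following base R sketch works through the arithmetic for the default values. The data set size of 200 rows is hypothetical, and the fit count assumes a model with a single tuning parameter under grid search; models with several tuning parameters evaluate a larger grid.

```r
# Defaults taken from the Usage section.
p       <- 0.7   # share of rows used for training
number  <- 10    # folds per cross-validation round
repeats <- 10    # complete rounds of k-fold cross-validation
length  <- 10    # candidate values per tuning parameter

n_rows <- 200    # hypothetical data set size

# Rows used for training versus rows held out for testing.
n_train <- round(p * n_rows)
n_test  <- n_rows - n_train

# Each tuning candidate is fitted once per fold in every round.
resamples_per_candidate <- number * repeats

# Resampling fits for a single-parameter model (excludes the final refit
# on the full training set).
fits_per_model <- resamples_per_candidate * length

cat("training rows:", n_train, "\n")
cat("test rows:", n_test, "\n")
cat("resamples per candidate:", resamples_per_candidate, "\n")
cat("resampling fits per single-parameter model:", fits_per_model, "\n")
```

Because five algorithms are tuned in turn, the defaults can be slow on larger data sets; lowering `repeats` or `length`, or setting `parallelComputing = TRUE`, are the obvious levers.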

Details

Types of classification and regression models available for use with tuneTrain can be found using names(getModelInfo()). The results given depend on the type of model used.
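As a brief illustration of that lookup, the sketch below lists the model names known to caret and inspects the tuning parameters of one of them with `modelLookup`. The method strings checked here ("knn", "rf", "nnet") are standard caret names; which exact strings this package uses internally for SVM and Bcart is not stated on this page, so they are left out.

```r
if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)

  # All model names that can be passed as `method` to caret::train.
  all_models <- names(getModelInfo())
  print(length(all_models))

  # Check that a few well-known method names are available.
  print(c("knn", "rf", "nnet") %in% all_models)

  # Tuning parameters of a single model; KNN has one parameter, k.
  knn_params <- modelLookup("knn")
  print(knn_params)
}
```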

Value

A data frame containing the modeling metrics of five machine learning algorithms: KNN, SVM, RF, NNET, and Bcart.

tuneTrain relies on package caret to perform the modeling.

Author(s)

Zakaria Kehel, Khadija Aziz

Examples

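if(interactive()){
 library(icardaFIGSr)
 # Load the example data set shipped with the package
 data(septoriaDurumWC)
 # Fit all five algorithms for the two-class trait ST_S, with "R" as the positive class
 models <- modelingSummary(data = septoriaDurumWC, y = "ST_S", positive = "R", classtype = 2)
}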

[Package icardaFIGSr version 1.0.2 Index]