llm.cv {LLM} | R Documentation |
Runs v-fold cross validation with LLM
Description
In v-fold cross validation, the data are divided into v subsets of approximately equal size. In each round, one of the v subsets is held out while the remainder of the data is used to create a logitleafmodel object, and predictions are generated for the held-out subset. The process is repeated v times, so that each subset is held out exactly once.
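The fold logic described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data are simulated, the names fold_id and held_out are made up for the example, and glm() logistic regression stands in for the logit leaf model.

```r
# Minimal v-fold cross-validation sketch in base R.
# Assumption: glm() is a stand-in for the logit leaf model; data are simulated.
set.seed(42)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- rbinom(n, size = 1, prob = plogis(1.5 * X$x1 - X$x2))

v <- 5
# Randomly assign each observation to one of v folds of near-equal size
fold_id <- sample(rep(seq_len(v), length.out = n))

pred <- numeric(n)
for (k in seq_len(v)) {
  held_out <- fold_id == k
  # Fit on the v - 1 remaining folds
  train <- cbind(X[!held_out, , drop = FALSE], y = Y[!held_out])
  fit <- glm(y ~ ., data = train, family = binomial())
  # Predict probabilities for the excluded fold only
  pred[held_out] <- predict(fit, newdata = X[held_out, , drop = FALSE],
                            type = "response")
}

table(fold_id)  # number of observations per fold
```

After the loop, every observation has exactly one out-of-fold prediction in pred, because each fold is held out once.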
Usage
llm.cv(X, Y, cv, threshold_pruning = 0.25, nbr_obs_leaf = 100)
Arguments
X |
Dataframe containing numerical independent variables. |
Y |
Numerical vector of dependent variable. Currently only binary classification is supported. |
cv |
An integer specifying the number of folds in the cross-validation. |
threshold_pruning |
Set confidence threshold for pruning. Default 0.25. |
nbr_obs_leaf |
The minimum number of observations in a leaf node. Default 100. |
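Before calling llm.cv, it can help to check that the inputs match the argument descriptions above. The helper below is hypothetical (check_cv_inputs is not part of the LLM package), and the specific checks, such as requiring at least 2 folds, are assumptions made for this sketch.

```r
# Hypothetical helper, not part of the LLM package: checks inputs against
# the documented argument descriptions before running cross-validation.
check_cv_inputs <- function(X, Y, cv) {
  stopifnot(is.data.frame(X))
  # All independent variables must be numeric
  stopifnot(all(vapply(X, is.numeric, logical(1))))
  # One outcome value per row of X
  stopifnot(length(Y) == nrow(X))
  # Only binary classification is supported
  stopifnot(length(unique(Y)) == 2)
  # Assumption: a whole number of folds between 2 and the number of rows
  stopifnot(cv >= 2, cv <= nrow(X), cv == as.integer(cv))
  invisible(TRUE)
}

# Toy data for illustration only
X_demo <- data.frame(a = c(1.2, 3.4, 5.6, 7.8), b = c(0, 1, 0, 1))
Y_demo <- c(0, 1, 1, 0)
check_cv_inputs(X_demo, Y_demo, cv = 2)
```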
Value
An object of class llm.cv, which is a list with the following components:
foldpred |
a data frame with, per fold, predicted class membership probabilities for the left-out observations. |
pred |
a data frame with predicted class membership probabilities. |
foldclass |
a data frame with, per fold, predicted classes for the left-out observations. |
class |
a data frame with the predicted classes. |
conf |
the confusion matrix which compares the real versus the predicted class memberships based on the class object. |
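As a rough illustration of how a confusion matrix relates real and predicted classes, the base R sketch below builds one from made-up values. The labels, the probabilities and the 0.5 cutoff are assumptions for this example; the cutoff that llm.cv applies is not stated here.

```r
# Illustrative values only: hypothetical true labels and predicted probabilities
actual <- factor(c("neg", "pos", "pos", "neg", "pos", "neg"),
                 levels = c("neg", "pos"))
prob_pos <- c(0.20, 0.80, 0.40, 0.10, 0.90, 0.60)

# Assumption: a 0.5 cutoff turns probabilities into predicted classes
predicted <- factor(ifelse(prob_pos >= 0.5, "pos", "neg"),
                    levels = c("neg", "pos"))

# Rows are the real classes, columns the predicted classes
conf <- table(actual = actual, predicted = predicted)

# Share of observations on the diagonal, i.e. correctly classified
accuracy <- sum(diag(conf)) / sum(conf)
```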
Author(s)
Arno De Caigny, a.de-caigny@ieseg.fr, Kristof Coussement, k.coussement@ieseg.fr and Koen W. De Bock, kdebock@audencia.com
References
Arno De Caigny, Kristof Coussement, Koen W. De Bock, A New Hybrid Classification Algorithm for Customer Churn Prediction Based on Logistic Regression and Decision Trees, European Journal of Operational Research (2018), doi: 10.1016/j.ejor.2018.02.009.
See Also
predict.llm, table.llm.html, llm
Examples
## Load the PimaIndiansDiabetes dataset from the mlbench package
## (the example is skipped when mlbench is not installed)
if (requireNamespace("mlbench", quietly = TRUE)) {
  library("mlbench")
  data("PimaIndiansDiabetes")
  ## Run 5-fold cross validation with the LLM; column 9 is the outcome (diabetes)
  Pima.llm <- llm.cv(X = PimaIndiansDiabetes[, -c(9)], Y = PimaIndiansDiabetes$diabetes,
                     cv = 5, threshold_pruning = 0.25, nbr_obs_leaf = 100)
}