C5.0Control {C50} | R Documentation |
Various parameters that control aspects of the C5.0 fit.
C5.0Control(
  subset = TRUE,
  bands = 0,
  winnow = FALSE,
  noGlobalPruning = FALSE,
  CF = 0.25,
  minCases = 2,
  fuzzyThreshold = FALSE,
  sample = 0,
  seed = sample.int(4096, size = 1) - 1L,
  earlyStopping = TRUE,
  label = "outcome"
)
subset |
A logical: should the model evaluate groups of
discrete predictors for splits? Note: the C5.0 command line
version defaults this parameter to |
bands |
An integer between 2 and 1000. If |
winnow |
A logical: should predictor winnowing (i.e., feature selection) be used? |
noGlobalPruning |
A logical: should the final, global pruning step that simplifies the tree be skipped? |
CF |
A number in (0, 1) for the confidence factor. |
minCases |
An integer for the smallest number of samples that must be put in at least two of the splits. |
fuzzyThreshold |
A logical toggle to evaluate possible advanced splits of the data. See Quinlan (1993) for details and examples. |
sample |
A value between 0 and 0.999 that specifies the random proportion of the data that should be used to train the model. By default (sample = 0), all the samples are used for model training. Samples not used for training are used to evaluate the accuracy of the model in the printed output. |
seed |
An integer for the random number seed within the C code. |
earlyStopping |
A logical to toggle whether the internal method for stopping boosting should be used. |
label |
A character label for the outcome used in the output. |
Returns a list of options.
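As a minimal sketch of the return value, the call below builds a control object with a few non-default settings and inspects it with base R. It assumes the C50 package is installed; the specific values (CF = 0.1, minCases = 5, seed = 42) are arbitrary choices for illustration, not recommendations.

```r
library(C50)

# Build a control object with a few non-default settings.
# The values here are illustrative only.
ctrl <- C5.0Control(CF = 0.1, minCases = 5, seed = 42)

# The return value is a list, so base R tools can inspect it.
is.list(ctrl)
str(ctrl)
```

Passing an explicit seed rather than relying on the random default should make repeated fits easier to compare.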
Original GPL C code by Ross Quinlan, R code and modifications to C by Max Kuhn, Steve Weston and Nathan Coulter
Quinlan R (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, http://www.rulequest.com/see5-unix.html
C5.0(), predict.C5.0(), summary.C5.0(), C5imp()
library(C50)
library(modeldata)
data(mlc_churn)

# Fit a single tree with predictor winnowing turned on
treeModel <- C5.0(x = mlc_churn[1:3333, -20],
                  y = mlc_churn$churn[1:3333],
                  control = C5.0Control(winnow = TRUE))
summary(treeModel)
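The sample and seed arguments can be combined in the same way. The sketch below assumes the C50 and modeldata packages are installed and reuses the mlc_churn rows from the example above; the object names and the particular values (an 80% training proportion, seed 1, and stricter minCases and CF settings) are hypothetical choices for illustration.

```r
library(C50)
library(modeldata)
data(mlc_churn)

# Train on a random 80% of the supplied rows. The remaining rows are
# used to evaluate accuracy in the printed output. Fixing the seed
# keeps the random split the same across runs.
holdoutControl <- C5.0Control(sample = 0.8, seed = 1,
                              minCases = 10, CF = 0.1)

holdoutModel <- C5.0(x = mlc_churn[1:3333, -20],
                     y = mlc_churn$churn[1:3333],
                     control = holdoutControl)
summary(holdoutModel)
```

Because the split happens inside the C code, the held-out rows are not returned to R; the evaluation is only visible in the summary output.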