| RFTrainer {superml} | R Documentation |
Random Forest Trainer
Description
Trains a random forest model.
Details
Trains a Random Forest model. A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve predictive accuracy and to control over-fitting. This implementation uses the ranger R package, which provides fast model training.
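As a rough illustration (not the package's documented internals), the trainer's options correspond to arguments of ranger::ranger() roughly as sketched below; the exact parameter mapping is an assumption made here for clarity.

# Hypothetical mapping of RFTrainer options onto a direct ranger call;
# the actual internal call made by superml may differ.
library(ranger)
data("iris")
rf <- ranger::ranger(Species ~ ., data = iris,
                     num.trees     = 100,                          # ~ n_estimators
                     mtry          = floor(sqrt(ncol(iris) - 1)),  # ~ max_features = "auto"
                     max.depth     = 4,                            # ~ max_depth
                     min.node.size = 1,                            # ~ min_node_size
                     importance    = "impurity",                   # ~ importance
                     seed          = 42)                           # ~ seed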
Public fields
n_estimators: the number of trees in the forest, default = 100
max_features: the number of features to consider when looking for the best split. Possible values are "auto" (default, takes sqrt(num_of_features)), "sqrt" (same as "auto"), "log" (takes log(num_of_features)) and "none" (takes all features). See the sketch after this list.
max_depth: the maximum depth of each tree
min_node_size: the minimum number of samples required to split an internal node
criterion: the function used to measure the quality of a split. For classification, "gini" (the Gini index) is used; for regression, the variance of the responses is used.
classification: whether to train for classification (1) or regression (0)
verbose: show computation status and estimated runtime
seed: seed value
class_weights: weights associated with the classes, used for sampling of training observations
always_split: vector of feature names to always be used for splitting
importance: variable importance mode, one of "none", "impurity", "impurity_corrected", "permutation". The "impurity" measure is the Gini index for classification and the variance of the responses for regression. Defaults to "impurity".
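As an illustration, the fields above can be set through the constructor; the values below are arbitrary and chosen only to show the interface.

# Illustrative construction; argument values are arbitrary, not recommendations
rf <- RFTrainer$new(n_estimators = 200,
                    max_features = "sqrt",
                    max_depth = 6,
                    min_node_size = 5,
                    classification = 1,
                    importance = "impurity",
                    seed = 42,
                    verbose = FALSE)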
Methods
Public methods
Method new()
Usage
RFTrainer$new(n_estimators, max_depth, max_features, min_node_size, classification, class_weights, always_split, verbose, save_model, seed, importance)
Arguments
n_estimators: integer, the number of trees in the forest, default = 100
max_depth: integer, the maximum depth of each tree
max_features: the number of features to consider when looking for the best split. Possible values are "auto" (default, takes sqrt(num_of_features)), "sqrt" (same as "auto"), "log" (takes log(num_of_features)) and "none" (takes all features)
min_node_size: integer, the minimum number of samples required to split an internal node
classification: integer, whether to train for classification (1) or regression (0)
class_weights: weights associated with the classes, used for sampling of training observations
always_split: vector of feature names to always be used for splitting
verbose: logical, show computation status and estimated runtime
save_model: logical, whether to save the model
seed: integer, seed value
importance: variable importance mode, one of "none", "impurity", "impurity_corrected", "permutation". The "impurity" measure is the Gini index for classification and the variance of the responses for regression. Defaults to "impurity".
Details
Create a new 'RFTrainer' object.
Returns
An 'RFTrainer' object.
Examples
data("iris")
bst <- RFTrainer$new(n_estimators=10,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
Method fit()
Usage
RFTrainer$fit(X, y)
Arguments
X: data.frame containing train features
y: character, name of the target variable
Details
Trains the random forest model
Returns
NULL, trains and saves the model in memory
Examples
data("iris")
bst <- RFTrainer$new(n_estimators=10,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
bst$fit(iris, 'Species')
Method predict()
Usage
RFTrainer$predict(df)
Arguments
df: data.frame containing test features
Details
Returns predictions from the random forest model
Returns
a vector containing predictions
Examples
data("iris")
bst <- RFTrainer$new(n_estimators=10,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
bst$fit(iris, 'Species')
predictions <- bst$predict(iris)
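# Illustrative follow-up (assumes predictions are class labels comparable
# to iris$Species):
table(predicted = predictions, actual = iris$Species)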
Method get_importance()
Usage
RFTrainer$get_importance()
Details
Returns feature importance from the model
Returns
a data frame containing feature importance scores
Examples
data("iris")
bst <- RFTrainer$new(n_estimators=50,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
bst$fit(iris, 'Species')
predictions <- bst$predict(iris)
bst$get_importance()
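# Illustrative follow-up: the returned importance can be stored and inspected
# (assumes a data frame of per-feature scores, as described above)
imp <- bst$get_importance()
head(imp)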
Method clone()
The objects of this class are cloneable with this method.
Usage
RFTrainer$clone(deep = FALSE)
Arguments
deep: Whether to make a deep clone.
Examples
## ------------------------------------------------
## Method `RFTrainer$new`
## ------------------------------------------------
data("iris")
bst <- RFTrainer$new(n_estimators=10,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
## ------------------------------------------------
## Method `RFTrainer$fit`
## ------------------------------------------------
data("iris")
bst <- RFTrainer$new(n_estimators=10,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
bst$fit(iris, 'Species')
## ------------------------------------------------
## Method `RFTrainer$predict`
## ------------------------------------------------
data("iris")
bst <- RFTrainer$new(n_estimators=10,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
bst$fit(iris, 'Species')
predictions <- bst$predict(iris)
## ------------------------------------------------
## Method `RFTrainer$get_importance`
## ------------------------------------------------
data("iris")
bst <- RFTrainer$new(n_estimators=50,
max_depth=4,
classification=1,
seed=42,
verbose=TRUE)
bst$fit(iris, 'Species')
predictions <- bst$predict(iris)
bst$get_importance()