h2o4gpu.random_forest_classifier {h2o4gpu}    R Documentation
Random Forest Classifier
Description
Random Forest Classifier
Usage
h2o4gpu.random_forest_classifier(n_estimators = 100L, criterion = "gini",
max_depth = 3L, min_samples_split = 2L, min_samples_leaf = 1L,
min_weight_fraction_leaf = 0, max_features = "auto",
max_leaf_nodes = NULL, min_impurity_decrease = 0,
min_impurity_split = NULL, bootstrap = TRUE, oob_score = FALSE,
n_jobs = 1L, random_state = NULL, verbose = 0L, warm_start = FALSE,
class_weight = NULL, subsample = 1, colsample_bytree = 1,
num_parallel_tree = 1L, tree_method = "gpu_hist", n_gpus = -1L,
predictor = "gpu_predictor", backend = "h2o4gpu")
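A minimal end-to-end sketch of the call above, assuming the package's fit() and predict() generics and using the built-in iris data; the specific parameter values are illustrative only:

library(h2o4gpu)

# Features as a numeric matrix; labels encoded as integers starting at 0
x <- as.matrix(iris[, 1:4])
y <- as.integer(iris$Species) - 1L

# Construct and train the classifier
model <- fit(h2o4gpu.random_forest_classifier(n_estimators = 100L, random_state = 42L), x, y)

# Predict class labels (here on the training data, for brevity)
pred <- predict(model, x)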
Arguments
n_estimators |
The number of trees in the forest. |
criterion |
The function to measure the quality of a split. Supported criteria are "gini" for the Gini impurity and "entropy" for the information gain. Note: this parameter is tree-specific. |
max_depth |
The maximum depth of the tree. If NULL, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. |
min_samples_split |
The minimum number of samples required to split an internal node. If an integer, it is taken as the minimum count; if a float, it is a fraction and ceiling(min_samples_split * n_samples) is the minimum number of samples for each split. |
min_samples_leaf |
The minimum number of samples required to be at a leaf node. If an integer, it is taken as the minimum count; if a float, it is a fraction and ceiling(min_samples_leaf * n_samples) is the minimum number of samples for each node. |
min_weight_fraction_leaf |
The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Samples have equal weight when sample_weight is not provided. |
max_features |
The number of features to consider when looking for the best split. If an integer, consider that many features at each split; if a float, it is a fraction and int(max_features * n_features) features are considered. "auto" and "sqrt" use sqrt(n_features), "log2" uses log2(n_features), and NULL uses all n_features. |
max_leaf_nodes |
Grow trees with at most max_leaf_nodes leaf nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If NULL, the number of leaf nodes is unlimited. |
min_impurity_decrease |
A node will be split if this split induces a decrease of the impurity greater than or equal to this value. |
min_impurity_split |
Threshold for early stopping in tree growth. A node will split if its impurity is above the threshold, otherwise it is a leaf. |
bootstrap |
Whether bootstrap samples are used when building trees. |
oob_score |
Whether to use out-of-bag samples to estimate the generalization accuracy on unseen data. |
n_jobs |
The number of jobs to run in parallel for both fit and predict. If -1, the number of jobs is set to the number of cores. |
random_state |
If an integer, random_state is the seed used by the random number generator; if a RandomState instance, random_state is the random number generator itself; if NULL, the random number generator is the RandomState instance used by np.random. |
verbose |
Controls the verbosity of the tree building process. |
warm_start |
When set to TRUE, reuse the solution of the previous call to fit and add more estimators to the ensemble; otherwise, fit a whole new forest. |
class_weight |
"balanced_subsample" or NULL, optional (default=NULL) Weights associated with classes in the form |
subsample |
Subsample ratio of the training instance. |
colsample_bytree |
Subsample ratio of columns when constructing each tree. |
num_parallel_tree |
Number of trees to grow per round. |
tree_method |
The tree construction algorithm used in XGBoost. The distributed and external-memory versions only support the approximate algorithm. Choices (see the example below the argument list):
'auto': use a heuristic to choose the faster method; exact greedy for small to medium datasets and the approximate algorithm for very large datasets. Because the old behavior was to always use exact greedy on a single machine, a message is printed when the approximate algorithm is chosen.
'exact': exact greedy algorithm.
'approx': approximate greedy algorithm using sketching and histograms.
'hist': fast histogram-optimized approximate greedy algorithm, with performance improvements such as bin caching.
'gpu_exact': GPU implementation of the exact algorithm.
'gpu_hist': GPU implementation of the hist algorithm. |
n_gpus |
Number of GPUs to use in the RandomForestClassifier solver. The default is -1. |
predictor |
The type of predictor algorithm to use. Both options give the same results but allow prediction to run on either the CPU or the GPU.
'cpu_predictor': multicore CPU prediction algorithm.
'gpu_predictor': prediction on the GPU; the default for the 'gpu_exact' and 'gpu_hist' tree methods. |
backend |
Which backend to use. Options are 'auto', 'sklearn', 'h2o4gpu'. The backend actually used is saved as an attribute on the fitted model. See the example below the argument list. |
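As referenced under class_weight, a short sketch of reweighting imbalanced classes with the string option documented above; the seed value is an arbitrary example:

# Recompute class weights on each tree's bootstrap sample, with a fixed seed
balanced_rf <- h2o4gpu.random_forest_classifier(
  class_weight = "balanced_subsample",
  random_state = 123L
)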
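As referenced under tree_method, a hedged sketch of choosing between the GPU and CPU histogram algorithms; the parameter combinations are assumptions for illustration, not tuning advice:

# GPU configuration: histogram algorithm on all available GPUs
gpu_rf <- h2o4gpu.random_forest_classifier(
  tree_method = "gpu_hist",
  predictor = "gpu_predictor",
  n_gpus = -1L
)

# CPU-only configuration: histogram algorithm with the CPU predictor
cpu_rf <- h2o4gpu.random_forest_classifier(
  tree_method = "hist",
  predictor = "cpu_predictor",
  n_gpus = 0L
)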
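As referenced under backend, a sketch of explicitly selecting the scikit-learn backend instead of the default 'h2o4gpu' backend, for example on a machine without a usable GPU; this choice is an illustration, not a recommendation:

# Force the scikit-learn (CPU) backend
sk_rf <- h2o4gpu.random_forest_classifier(backend = "sklearn")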