h2o4gpu.random_forest_classifier {h2o4gpu}    R Documentation
Random Forest Classifier
Description
Random Forest Classifier
Usage
h2o4gpu.random_forest_classifier(n_estimators = 100L, criterion = "gini",
max_depth = 3L, min_samples_split = 2L, min_samples_leaf = 1L,
min_weight_fraction_leaf = 0, max_features = "auto",
max_leaf_nodes = NULL, min_impurity_decrease = 0,
min_impurity_split = NULL, bootstrap = TRUE, oob_score = FALSE,
n_jobs = 1L, random_state = NULL, verbose = 0L, warm_start = FALSE,
class_weight = NULL, subsample = 1, colsample_bytree = 1,
num_parallel_tree = 1L, tree_method = "gpu_hist", n_gpus = -1L,
predictor = "gpu_predictor", backend = "h2o4gpu")
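A minimal end-to-end sketch of the call above, assuming the package's fit() and predict() generics and using the built-in iris data; the specific parameter values are illustrative only:

library(h2o4gpu)

# Features as a numeric matrix; labels encoded as integers starting at 0
x <- as.matrix(iris[, 1:4])
y <- as.integer(iris$Species) - 1L

# Construct and train the classifier
model <- fit(h2o4gpu.random_forest_classifier(n_estimators = 100L, random_state = 42L), x, y)

# Predict class labels (here on the training data, for brevity)
pred <- predict(model, x)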
Arguments
n_estimators |
The number of trees in the forest. |
criterion |
The function to measure the quality of a split. Supported criteria are "gini" for the Gini impurity and "entropy" for the information gain. Note: this parameter is tree-specific. |
max_depth |
The maximum depth of the tree. If NULL, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. |
min_samples_split |
The minimum number of samples required to split an internal node. If an integer, it is taken as the minimum count; if a float, it is a fraction and ceiling(min_samples_split * n_samples) is the minimum number of samples for each split. |
min_samples_leaf |
The minimum number of samples required to be at a leaf node. If an integer, it is taken as the minimum count; if a float, it is a fraction and ceiling(min_samples_leaf * n_samples) is the minimum number of samples for each node. |
min_weight_fraction_leaf |
The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Samples have equal weight when sample_weight is not provided. |
max_features |
The number of features to consider when looking for the best split. If an integer, consider that many features at each split; if a float, it is a fraction and int(max_features * n_features) features are considered. "auto" and "sqrt" use sqrt(n_features), "log2" uses log2(n_features), and NULL uses all n_features. |
max_leaf_nodes |
Grow trees with at most max_leaf_nodes leaf nodes in best-first fashion. Best nodes are defined as relative reduction in impurity. If NULL, the number of leaf nodes is unlimited. |
min_impurity_decrease |
A node will be split if this split induces a decrease of the impurity greater than or equal to this value. |
min_impurity_split |
Threshold for early stopping in tree growth. A node will split if its impurity is above the threshold, otherwise it is a leaf. |
bootstrap |
Whether bootstrap samples are used when building trees. |
oob_score |
Whether to use out-of-bag samples to estimate the generalization accuracy on unseen data. |
n_jobs |
The number of jobs to run in parallel for both fit and predict. If -1, the number of jobs is set to the number of cores. |
random_state |
If an integer, random_state is the seed used by the random number generator; if a RandomState instance, random_state is the random number generator itself; if NULL, the random number generator is the RandomState instance used by np.random. |
verbose |
Controls the verbosity of the tree building process. |
warm_start |
When set to TRUE, reuse the solution of the previous call to fit and add more estimators to the ensemble; otherwise, fit a whole new forest. |
class_weight |
"balanced_subsample" or NULL, optional (default=NULL) Weights associated with classes in the form |
subsample |
Subsample ratio of the training instance. |
colsample_bytree |
Subsample ratio of columns when constructing each tree. |
num_parallel_tree |
Number of trees to grow per round. |
tree_method |
The tree construction algorithm used in XGBoost. The distributed and external-memory versions only support the approximate algorithm. Choices (see the example below the argument list):
'auto': use a heuristic to choose the faster method; exact greedy for small to medium datasets and the approximate algorithm for very large datasets. Because the old behavior was to always use exact greedy on a single machine, a message is printed when the approximate algorithm is chosen.
'exact': exact greedy algorithm.
'approx': approximate greedy algorithm using sketching and histograms.
'hist': fast histogram-optimized approximate greedy algorithm, with performance improvements such as bin caching.
'gpu_exact': GPU implementation of the exact algorithm.
'gpu_hist': GPU implementation of the hist algorithm. |
n_gpus |
Number of GPUs to use in the RandomForestClassifier solver. The default is -1. |
predictor |
The type of predictor algorithm to use. Both options give the same results but allow prediction to run on either the CPU or the GPU.
'cpu_predictor': multicore CPU prediction algorithm.
'gpu_predictor': prediction on the GPU; the default for the 'gpu_exact' and 'gpu_hist' tree methods. |
backend |
Which backend to use. Options are 'auto', 'sklearn', 'h2o4gpu'. The backend actually used is saved as an attribute on the fitted model. See the example below the argument list. |
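As referenced under class_weight, a short sketch of reweighting imbalanced classes with the string option documented above; the seed value is an arbitrary example:

# Recompute class weights on each tree's bootstrap sample, with a fixed seed
balanced_rf <- h2o4gpu.random_forest_classifier(
  class_weight = "balanced_subsample",
  random_state = 123L
)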
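As referenced under tree_method, a hedged sketch of choosing between the GPU and CPU histogram algorithms; the parameter combinations are assumptions for illustration, not tuning advice:

# GPU configuration: histogram algorithm on all available GPUs
gpu_rf <- h2o4gpu.random_forest_classifier(
  tree_method = "gpu_hist",
  predictor = "gpu_predictor",
  n_gpus = -1L
)

# CPU-only configuration: histogram algorithm with the CPU predictor
cpu_rf <- h2o4gpu.random_forest_classifier(
  tree_method = "hist",
  predictor = "cpu_predictor",
  n_gpus = 0L
)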
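As referenced under backend, a sketch of explicitly selecting the scikit-learn backend instead of the default 'h2o4gpu' backend, for example on a machine without a usable GPU; this choice is an illustration, not a recommendation:

# Force the scikit-learn (CPU) backend
sk_rf <- h2o4gpu.random_forest_classifier(backend = "sklearn")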