tunedivfor {diversityForest} | R Documentation |
Optimization of the values of the tuning parameters nsplits and proptry
Description
First, a grid of possible values may be provided for each of nsplits and proptry; default grids are used if no grids are provided. Second, a forest is constructed for each pairwise combination of values from these two grids. Third, the pair of nsplits and proptry values associated with the smallest out-of-bag prediction error is used as the optimized set of parameter values. If several pairs of parameter values are associated with the same smallest out-of-bag prediction error, the pair with the smallest (parameter) values is used.
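The three steps above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: `toy_oob_error()` is a hypothetical stand-in for "construct a forest and return its out-of-bag prediction error", and the tie-break order (smallest nsplits first, then smallest proptry) is an assumed reading of "the pair with the smallest (parameter) values".

```r
# Hypothetical stand-in for fitting a forest and returning its OOB error.
# It is NOT part of diversityForest; it returns fixed toy numbers with a
# deliberate tie at the minimum so that the tie-break step is exercised.
toy_oob_error <- function(nsplits, proptry) {
  if (nsplits %in% c(5, 10) && proptry == 0.05) 0.04 else 0.06
}

# Step 1: the two grids (here the defaults shown under Usage).
nsplitsgrid <- c(2, 5, 10, 30, 50, 100, 200)
proptrygrid <- c(0.05, 1)

# Step 2: one error value for each pairwise combination of grid values.
tunegrid <- expand.grid(nsplits = nsplitsgrid, proptry = proptrygrid)
ooberrs <- mapply(toy_oob_error, tunegrid$nsplits, tunegrid$proptry)

# Step 3: keep the pairs that share the smallest error ...
best <- tunegrid[ooberrs == min(ooberrs), , drop = FALSE]

# ... and break ties (assumed order: smallest nsplits, then smallest proptry).
best <- best[order(best$nsplits, best$proptry), , drop = FALSE]
nsplitsopt <- best$nsplits[1]
proptryopt <- best$proptry[1]
```

With the toy error values above, two pairs tie for the smallest error, and the tie-break selects the one with the smaller nsplits.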
Usage
tunedivfor(
  formula = NULL,
  data = NULL,
  nsplitsgrid = c(2, 5, 10, 30, 50, 100, 200),
  proptrygrid = c(0.05, 1),
  num.trees.pre = 500
)
Arguments
formula |
Object of class |
data |
Training data of class |
nsplitsgrid |
Grid of values to consider for nsplits. |
proptrygrid |
Grid of values to consider for proptry. |
num.trees.pre |
Number of trees used for each forest constructed during tuning parameter optimization. Default is 500. |
Value
List with elements
nsplitsopt |
Optimized value of nsplits. |
proptryopt |
Optimized value of proptry. |
tunegrid |
Two-dimensional |
ooberrs |
The out-of-bag prediction errors obtained for each pair of values considered for nsplits and proptry. |
Author(s)
Roman Hornung
References
Hornung, R. (2022). Diversity forests: Using split sampling to enable innovative complex split procedures in random forests. SN Computer Science 3(2):1, <doi:10.1007/s42979-021-00920-1>.
Wright, M. N., Ziegler, A. (2017). ranger: A fast Implementation of Random Forests for High Dimensional Data in C++ and R. Journal of Statistical Software 77:1-17, <doi:10.18637/jss.v077.i01>.
See Also
Examples
## Load package:
library("diversityForest")
## Set seed to obtain reproducible results:
set.seed(1234)
## Tuning parameter optimization for the iris data set:
tuneres <- tunedivfor(formula = Species ~ ., data = iris, num.trees.pre = 20)
# NOTE: num.trees.pre = 20 is too small for practical purposes:
# the out-of-bag error estimates of the forests constructed
# during optimization will be much too variable.
# In practice, num.trees.pre = 500 (default value) or a
# larger number should be used.
tuneres
tuneres$nsplitsopt
tuneres$proptryopt
tuneres$tunegrid
tuneres$ooberrs