tunedivfor {diversityForest}    R Documentation

Optimization of the values of the tuning parameters nsplits and proptry

Description

Optimizes the tuning parameters nsplits and proptry by grid search. First, a grid of candidate values may be provided for each of nsplits and proptry; default grids are used where none is provided. Second, a forest is constructed for each pairwise combination of values from the two grids. Third, the pair of nsplits and proptry values associated with the smallest out-of-bag prediction error is taken as the optimized parameter values. If several pairs are associated with the same smallest out-of-bag prediction error, the pair with the smallest parameter values is used.
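
The selection rule described above can be sketched in base R. This is only an illustration, not the package's implementation: toy_oob_error is a hypothetical stand-in for "construct a forest and return its out-of-bag error", the shortened grid is chosen for brevity, and the tie order (smaller nsplits first, then smaller proptry) is an assumption about what "smallest parameter values" means.

```r
# Hypothetical stand-in for fitting a forest and returning its OOB error.
# It deliberately produces ties so the tie-breaking step is visible.
toy_oob_error <- function(nsplits, proptry) {
  if (nsplits >= 10) 0.04 else 0.06
}

nsplitsgrid <- c(2, 5, 10, 30)   # shortened grid, for illustration only
proptrygrid <- c(0.05, 1)

# All pairwise combinations, with nsplits in the first column.
tunegrid <- expand.grid(nsplits = nsplitsgrid, proptry = proptrygrid)

# Order rows so that smaller nsplits, then smaller proptry, come first
# (assumed tie order).
tunegrid <- tunegrid[order(tunegrid$nsplits, tunegrid$proptry), ]
rownames(tunegrid) <- NULL

# One error per parameter pair, in the same order as the rows of tunegrid.
ooberrs <- mapply(toy_oob_error, tunegrid$nsplits, tunegrid$proptry)

# which.min() returns the first minimum, so ties go to the earliest row.
best <- which.min(ooberrs)
nsplitsopt <- tunegrid$nsplits[best]
proptryopt <- tunegrid$proptry[best]
```

With the toy error function, four pairs share the smallest error, and the sketch selects the one that comes first in the assumed ordering.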

Usage

tunedivfor(
  formula = NULL,
  data = NULL,
  nsplitsgrid = c(2, 5, 10, 30, 50, 100, 200),
  proptrygrid = c(0.05, 1),
  num.trees.pre = 500
)

Arguments

formula

Object of class formula or character describing the model to fit. Interaction terms are supported only for numerical variables.

data

Training data of class data.frame, matrix, dgCMatrix (Matrix) or gwaa.data (GenABEL).

nsplitsgrid

Grid of values to consider for nsplits. Default grid: 2, 5, 10, 30, 50, 100, 200.

proptrygrid

Grid of values to consider for proptry. Default grid: 0.05, 1.

num.trees.pre

Number of trees used for each forest constructed during tuning parameter optimization. Default is 500.

Value

List with elements

nsplitsopt

Optimized value of nsplits.

proptryopt

Optimized value of proptry.

tunegrid

Two-dimensional data.frame, where each row contains one pair of values considered for nsplits (first entry) and proptry (second entry).

ooberrs

The out-of-bag prediction errors obtained for each pair of values considered for nsplits and proptry, where the ordering of pairs of values is the same as in tunegrid (see above).
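
Because tunegrid and ooberrs share the same ordering, they can be combined into a single table for inspection. The sketch below uses a hand-made list that has the element names documented above; every number in it is invented for illustration, and the column names nsplits, proptry, and ooberr are assumptions set explicitly in the code rather than names taken from the package.

```r
# Hand-made stand-in for a tunedivfor() result; all values are invented.
tuneres <- list(
  nsplitsopt = 5,
  proptryopt = 0.05,
  tunegrid   = data.frame(c(2, 2, 5, 5), c(0.05, 1, 0.05, 1)),
  ooberrs    = c(0.060, 0.053, 0.040, 0.047)
)

# Attach each error to its parameter pair (rows are in the same order).
results <- cbind(tuneres$tunegrid, tuneres$ooberrs)

# Set column names explicitly so the code does not depend on the
# names used in tunegrid (assumed names, chosen for readability).
names(results) <- c("nsplits", "proptry", "ooberr")

# Rank the pairs from lowest to highest out-of-bag error.
results <- results[order(results$ooberr), ]
print(results)
```

Looking at the full ranking rather than only the optimized pair can show whether the error surface is flat, in which case the choice among the best few pairs matters little.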

Author(s)

Roman Hornung

See Also

divfor

Examples


## Load package:

library("diversityForest")


## Set seed to obtain reproducible results:

set.seed(1234)


## Tuning parameter optimization for the iris data set:

tuneres <- tunedivfor(formula = Species ~ ., data = iris, num.trees.pre = 20)
# NOTE: num.trees.pre = 20 is too small for practical purposes:
# the out-of-bag error estimates of the forests constructed
# during optimization will be far too variable.
# In practice, num.trees.pre = 500 (the default) or a larger
# value should be used.

tuneres

tuneres$nsplitsopt
tuneres$proptryopt
tuneres$tunegrid
tuneres$ooberrs


[Package diversityForest version 0.4.0 Index]