SL.ranger.imp {flevr}

R Documentation

Super Learner wrapper for a ranger object with variable importance

Description

A Super Learner wrapper that fits a random forest with ranger and records variable importance (impurity-based by default).

Usage

SL.ranger.imp(
  Y,
  X,
  newX,
  family,
  obsWeights = rep(1, length(Y)),
  num.trees = 500,
  mtry = floor(sqrt(ncol(X))),
  write.forest = TRUE,
  probability = family$family == "binomial",
  min.node.size = ifelse(family$family == "gaussian", 5, 1),
  replace = TRUE,
  sample.fraction = ifelse(replace, 1, 0.632),
  num.threads = 1,
  verbose = FALSE,
  importance = "impurity",
  ...
)

Arguments

`Y`	Outcome variable
`X`	Training dataframe
`newX`	Test dataframe
`family`	Outcome family object, `gaussian()` or `binomial()`
`obsWeights`	Observation-level weights
`num.trees`	Number of trees.
`mtry`	Number of variables to possibly split at in each node. Default is the (rounded down) square root of the number of variables.
`write.forest`	Save ranger.forest object, required for prediction. Set to FALSE to reduce memory usage if no prediction intended.
`probability`	Grow a probability forest as in Malley et al. (2012).
`min.node.size`	Minimal node size. In ranger, the defaults are 1 for classification, 5 for regression, 3 for survival, and 10 for probability. This wrapper defaults to 5 when `family` is gaussian and 1 otherwise.
`replace`	Sample with replacement.
`sample.fraction`	Fraction of observations to sample. Default is 1 for sampling with replacement and 0.632 for sampling without replacement.
`num.threads`	Number of threads to use.
`verbose`	If TRUE, display additional output during execution.
`importance`	Variable importance mode, one of 'none', 'impurity', 'impurity_corrected', 'permutation'. The 'impurity' measure is the Gini index for classification, the variance of the responses for regression and the sum of test statistics (see `splitrule`) for survival.
`...`	Any additional arguments, not currently used.
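
Several defaults in the usage signature are expressions evaluated from the other arguments. As a minimal sketch in base R, the following reproduces those expressions outside the function to show what they evaluate to; the data frame `X` here is hypothetical (simulated, with 21 predictors) and is not part of the package.

```r
# Hypothetical training data: 100 rows, 21 predictors (not from flevr)
set.seed(1)
X <- as.data.frame(matrix(rnorm(100 * 21), nrow = 100))
family <- binomial()
replace <- TRUE

# Default expressions copied from the Usage section
mtry <- floor(sqrt(ncol(X)))                                 # floor(sqrt(21)) = 4
probability <- family$family == "binomial"                   # TRUE for binomial()
min.node.size <- ifelse(family$family == "gaussian", 5, 1)   # 1 for binomial()
sample.fraction <- ifelse(replace, 1, 0.632)                 # 1 when replace = TRUE

# The same node-size expression under a gaussian family evaluates to 5
min.node.size.gaussian <- ifelse(gaussian()$family == "gaussian", 5, 1)
```

Passing an explicit value for any of these arguments overrides the corresponding expression.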

Value

A named list with elements `pred` (predictions on `newX`) and `fit` (the fitted ranger object).
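
The returned list can be inspected as in the sketch below, which assumes the flevr and ranger packages are installed and uses simulated data (the variables `x1`, `x2`, `x3` are hypothetical). How the ranger object is stored inside `fit` is an assumption here: the code checks whether `fit` is itself a ranger object and otherwise looks for an `object` element, as `SuperLearner::SL.ranger` uses. Running `str(res$fit)` will confirm the layout in your installed version.

```r
# Assumes flevr and ranger are installed
library(flevr)

set.seed(1)
n <- 200
# Simulated data (hypothetical): x1 drives the outcome, x2 and x3 are noise
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- rbinom(n, size = 1, prob = plogis(2 * x$x1))

res <- SL.ranger.imp(Y = y, X = x, newX = x, family = binomial())

# Predictions on newX: one value per row of newX
length(res$pred)
range(res$pred)

# Locate the underlying ranger object (assumption: stored either directly
# in `fit` or in `fit$object`)
rf <- if (inherits(res$fit, "ranger")) res$fit else res$fit$object

# Impurity-based importance, one value per column of X
imp <- ranger::importance(rf)
sort(imp, decreasing = TRUE)
```

With `importance = "impurity"` (the default), larger values indicate variables that contributed more to reducing node impurity across the forest.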

References

Breiman, L. (2001). Random forests. Machine Learning 45:5-32.

Wright, M. N. & Ziegler, A. (2016). ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. Journal of Statistical Software, in press. http://arxiv.org/abs/1508.04409.

Examples

data("biomarkers")
# subset to complete cases for illustration
cc <- complete.cases(biomarkers)
dat_cc <- biomarkers[cc, ]
# use only the mucinous outcome, not the high-malignancy outcome
y <- dat_cc$mucinous
x <- dat_cc[, !(names(dat_cc) %in% c("mucinous", "high_malignancy"))]
feature_nms <- names(x)
# get the fit
set.seed(20231129)
fit <- SL.ranger.imp(Y = y, X = x, newX = x, family = binomial())
fit

[Package flevr version 0.0.4 Index]
