xrf.formula {xrf}

R Documentation

Fit an eXtreme RuleFit model

Description

See Friedman & Popescu (2008) for a description of the general RuleFit algorithm. This method uses XGBoost to fit a tree ensemble, extracts a rule set in which each rule is the conjunction of the split conditions along one tree traversal, and then uses glmnet to fit a sparse linear model on the resulting rule features together with the original features.
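To make the notion of a rule concrete, the following base R sketch builds two rules by hand and places them next to the original features. The split thresholds and the names `rule_1`, `rule_2`, and `design` are hypothetical, chosen for illustration; they are not extracted from a fitted XGBoost model and do not reflect how the package stores rules internally.

```r
# Hypothetical rules: each one is the conjunction of the split conditions
# met along a single root-to-leaf path of a depth-2 tree.
rule_1 <- iris$Petal.Width < 0.8 & iris$Sepal.Length < 5.5
rule_2 <- iris$Petal.Width >= 0.8 & iris$Sepal.Width < 3.0

# Rules enter the linear model as 0/1 indicator columns alongside the
# original features.
design <- cbind(
  iris[, c("Sepal.Length", "Sepal.Width", "Petal.Width")],
  rule_1 = as.integer(rule_1),
  rule_2 = as.integer(rule_2)
)
head(design)
```

A sparse linear model fit on a matrix of this shape can then keep or drop individual rules and original features through its penalty.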

Usage

## S3 method for class 'formula'
xrf(
  object,
  data,
  family,
  xgb_control = list(nrounds = 100, max_depth = 3),
  glm_control = list(type.measure = "deviance", nfolds = 5),
  sparse = TRUE,
  prefit_xgb = NULL,
  deoverlap = FALSE,
  ...
)

Arguments

`object`	a formula specifying the features to use in the model. transformations of the response variable are not supported. transformations of the input features are generally discouraged; if they are used, setting sparse = FALSE is suggested
`data`	a data frame with columns corresponding to the formula
`family`	the family of the fitted model. one of 'gaussian', 'binomial', or 'multinomial'
`xgb_control`	a list of parameters for xgboost. must supply an nrounds argument
`glm_control`	a list of parameters for the glmnet fit. must supply type.measure and nfolds arguments (used for the cross-validation over lambda)
`sparse`	whether a sparse design matrix should be used
`prefit_xgb`	an xgboost model (of class xgb.Booster) to be used instead of the model that `xrf` would normally fit
`deoverlap`	if TRUE, the tree-derived rules are de-overlapped, so that the resulting rule set contains no overlapping rules
`...`	ignored arguments
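As a rough illustration of how the arguments above fit together, the sketch below passes both control lists explicitly. The specific values (`nrounds = 5`, `type.measure = "mse"`, `nfolds = 3`) and the name `m_custom` are assumptions chosen to keep the fit fast, not recommended settings.

```r
library(xrf)

# Both control lists must be complete: xgb_control needs nrounds, and
# glm_control needs type.measure and nfolds.
m_custom <- xrf(
  Petal.Length ~ .,
  data = iris,
  family = "gaussian",
  xgb_control = list(nrounds = 5, max_depth = 2),
  glm_control = list(type.measure = "mse", nfolds = 3),
  sparse = TRUE,
  deoverlap = FALSE
)
```

Because each control list replaces its default as a whole, a list that overrides one entry still needs to carry the other required entries.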

References

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954.

Examples

# Fit a small Gaussian model; nrounds and max_depth are kept low so the example runs quickly
m <- xrf(Petal.Length ~ ., iris,
         xgb_control = list(nrounds = 2, max_depth = 2),
         family = 'gaussian')
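A fitted model is typically inspected or used for prediction next. The sketch below assumes the package's `predict` and `coef` S3 methods for fitted `xrf` objects, called with their defaults; the names `preds` and `rule_coefs` are illustrative.

```r
library(xrf)

m <- xrf(Petal.Length ~ ., iris,
         xgb_control = list(nrounds = 2, max_depth = 2),
         family = 'gaussian')

# Predictions on the training data (assumes the default lambda and type)
preds <- predict(m, iris)

# Coefficients of the sparse linear model over rules and original features
rule_coefs <- coef(m)
head(rule_coefs)
```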

[Package xrf version 0.2.2]