R: Partial dependence for a conditional random forest.

GetPartialData {moreparty}

R Documentation

Partial dependence for a conditional random forest.

Description

Computes the partial dependence for several covariates in a conditional random forest and gathers them into a single data frame.

Usage

GetPartialData(object, xnames=NULL, ice = FALSE, center = FALSE,
               grid.resolution = NULL, quantiles = TRUE, probs = 1:9/10,
               trim.outliers = FALSE, which.class = 1L, prob = TRUE,
               pred.fun = NULL, parallel = FALSE, paropts = NULL)

Arguments

`object`	An object as returned by `cforest` (or `fastcforest`).
`xnames`	A character vector of the covariates for which to compute the partial dependence. If NULL (default), partial dependence is computed for all the covariates in the model.
`ice`	Logical indicating whether or not to compute individual conditional expectation (ICE) curves. Default is FALSE. See Goldstein et al. (2014) for details.
`center`	Logical indicating whether or not to produce centered ICE curves (c-ICE curves). Only used when ice = TRUE. Default is FALSE. See Goldstein et al. (2014) for details.
`grid.resolution`	Integer giving the number of equally spaced points to use for the continuous variables listed in `xnames`. If left NULL, it will default to the minimum between 51 and the number of unique data points for each of the continuous independent variables listed in `xnames`.
`quantiles`	Logical indicating whether or not to use the sample quantiles of the continuous predictors listed in `xnames`. If `quantiles = TRUE` and `grid.resolution = NULL` (default), the sample quantiles will be used to generate the grid of joint values for which the partial dependence is computed.
`probs`	Numeric vector of probabilities with values in [0,1]. (Values up to 2e-14 outside that range are accepted and moved to the nearby endpoint.) Default is `1:9/10` which corresponds to the deciles of the predictor variables. These specify which quantiles to use for the continuous predictors listed in `xnames` when `quantiles = TRUE`.
`trim.outliers`	Logical indicating whether or not to trim off outliers from the continuous predictors listed in `xnames` (using the simple boxplot method) before generating the grid of joint values for which the partial dependence is computed. Default is FALSE.
`which.class`	Integer specifying which column of the matrix of predicted probabilities to use as the "focus" class. Default is to use the first class. Only used for classification problems.
`prob`	Logical indicating whether or not partial dependence for classification problems should be returned on the probability scale, rather than the centered logit. If FALSE, the partial dependence function is on a scale similar to the logit. Default is TRUE.
`pred.fun`	Optional prediction function that requires two arguments: `object` and `newdata`. If specified, then the function must return a single prediction or a vector of predictions (i.e., not a matrix or data frame). Default is NULL.
`parallel`	Logical indicating whether or not to run `partial` in parallel using a backend provided by the `foreach` package. Default is FALSE.
`paropts`	List containing additional options to be passed onto `foreach` when `parallel = TRUE`.

Details

The computation of partial dependence uses partial function from pdp package for each covariate. The results are then gathered and reshaped into a friendly data frame format.

Value

A data frame with covariates, their categories and their partial dependence effects.

Author(s)

Nicolas Robette

References

J. H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29: 1189-1232, 2001.

Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E., Peeking Inside the Black Box: Visualizing Statistical Learning With Plots of Individual Conditional Expectation. (2014) Journal of Computational and Graphical Statistics, 24(1): 44-65, 2015.

Examples

  data(iris)
  iris2 = iris
  iris2$Species = factor(iris$Species == "versicolor")
  iris.cf = party::cforest(Species ~ ., data = iris2, 
              controls = party::cforest_unbiased(mtry=2, ntree=50))
  GetPartialData(iris.cf)

[Package moreparty version 0.4 Index]