R: Plot and Print Methods for Stability Selection

plot.stabsel {stabs}

R Documentation

Plot and Print Methods for Stability Selection

Description

Display results of stability selection.

Usage

## S3 method for class 'stabsel'
plot(x, main = deparse(x$call), type = c("maxsel", "paths"),
     xlab = NULL, ylab = NULL, col = NULL, ymargin = 10, np = sum(x$max > 0),
     labels = NULL, ...)
## S3 method for class 'stabsel'
print(x, decreasing = FALSE, print.all = TRUE, ...)

Arguments

`x`	object of class `stabsel`.
`main`	main title for the plot.
`type`	plot type; either stability paths (`"paths"`) or a plot of the maximum selection frequency (`"maxsel"`).
`xlab`, `ylab`	labels for the x- and y-axis of the plot. By default, sensible labels are used depending on the `type` of the plot.
`col`	a vector of colors; typically, one can specify a single color or one color for each variable. By default, colors depend on the maximal selection frequency of the variable and range from grey to red.
`ymargin`	(temporarily) specifies the y margin of the plot in lines (see argument `"mar"` of function `par`). This only affects the right margin for `type = "paths"` and the left margin for `type = "maxsel"`. Margins explicitly set by the user are kept and not overwritten.
`np`	number of variables to plot for the maximum selection frequency plot (`type = "maxsel"`); the `np` variables with the highest selection frequency are plotted.
`labels`	variable labels for the plot; one label per variable / effect must be specified. By default, the names of `x$max` are used.
`decreasing`	logical. Should the selection frequencies be printed in descending order (`TRUE`) or in ascending order (`FALSE`)?
`print.all`	logical. Should all selection frequencies be displayed or only those that are greater than zero?
`...`	additional arguments to `plot` and `print` functions.
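
The arguments above can be combined as in the following sketch. It assumes that the stabs and lars packages are installed; the simulated data, the object name `stab`, and the label text are hypothetical and serve only as an illustration.

```r
## Sketch only: simulated data and all object names are illustrative.
if (require("stabs") && require("lars")) {
    set.seed(1234)
    ## 100 observations, 10 predictors; only x1 and x2 influence y
    x <- matrix(rnorm(1000), nrow = 100, ncol = 10,
                dimnames = list(NULL, paste0("x", 1:10)))
    y <- 2 * x[, 1] - 2 * x[, 2] + rnorm(100)

    stab <- stabsel(x = x, y = y, fitfun = lars.lasso,
                    cutoff = 0.75, PFER = 1)

    ## print frequencies in descending order, omitting zero frequencies
    print(stab, decreasing = TRUE, print.all = FALSE)

    ## maximum selection frequencies of the five top-ranked variables
    plot(stab, type = "maxsel", np = 5, main = "Top 5 variables")

    ## stability paths with user-defined labels (one per variable)
    plot(stab, type = "paths", labels = paste("Var", 1:10))
}
```

If either package is missing, the block is skipped and no object is created.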

Details

This function implements the stability selection procedure by Meinshausen and Buehlmann (2010) and the improved error bounds by Shah and Samworth (2013).

Two of the three arguments cutoff, q and PFER must be specified. The per-family error rate (PFER), i.e., the expected number of false positives E(V), where V is the number of false positives, is bounded by the argument PFER.

As controlling the PFER is more conservative than controlling the family-wise error rate (FWER), the procedure also controls the FWER, i.e., the probability of selecting at least one non-influential variable (or model component) is less than PFER.
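
To illustrate why two of the three quantities determine the third, the following base R sketch evaluates the original bound of Meinshausen and Buehlmann (2010), PFER <= q^2 / ((2 * cutoff - 1) * p), where p is the number of candidate variables. The helper names `mb_pfer_bound` and `mb_max_q` are hypothetical and not part of the package; the values the package computes may differ, in particular under the improved bounds of Shah and Samworth (2013).

```r
## Sketch only: hypothetical helpers, not functions from stabs.

## upper bound for the PFER given q, cutoff and the number of variables p
mb_pfer_bound <- function(q, cutoff, p) {
    stopifnot(cutoff > 0.5, cutoff <= 1)
    q^2 / ((2 * cutoff - 1) * p)
}

## largest integer q that keeps the bound at or below a target PFER
mb_max_q <- function(PFER, cutoff, p) {
    stopifnot(cutoff > 0.5, cutoff <= 1)
    floor(sqrt(PFER * (2 * cutoff - 1) * p))
}

p <- 9                                    # e.g., nine candidate predictors
q <- mb_max_q(PFER = 1, cutoff = 0.75, p = p)
bound <- mb_pfer_bound(q, cutoff = 0.75, p = p)
print(c(q = q, bound = bound))
```

Because q is rounded down to an integer, the resulting bound is at most the requested PFER.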

Value

An object of class stabsel with a special print method. The object has the following elements:

`phat`	selection probabilities.
`selected`	elements with maximal selection probability greater than `cutoff`.
`max`	maximum of selection probabilities.
`cutoff`	cutoff used.
`q`	average number of selected variables used.
`PFER`	per-family error rate.
`sampling.type`	the sampling type used for stability selection.
`assumption`	the assumptions made on the selection probabilities.
`call`	the call.
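
The following base R sketch shows how quantities analogous to `max` and `selected` can arise from repeated subsampling. It is not the package's implementation: the selector `select_top_q` (the `q` predictors most correlated with the response), the simulated data, and all object names are assumptions made for illustration, and only a single selection step is considered, so there is no path of selection probabilities as in `phat`. The threshold rule uses `>=`, following the stable-set definition of Meinshausen and Buehlmann (2010).

```r
## Sketch only: stability selection by hand, not the stabs code.
set.seed(1234)
n <- 100
p <- 10
x <- matrix(rnorm(n * p), nrow = n,
            dimnames = list(NULL, paste0("x", seq_len(p))))
y <- 2 * x[, 1] - 2 * x[, 2] + rnorm(n)

q <- 2          # number of variables selected per subsample
cutoff <- 0.75  # threshold for the selection frequency
B <- 100        # number of subsamples

## hypothetical selector: flag the q predictors most correlated with y
select_top_q <- function(x, y, q) {
    score <- abs(cor(x, y))[, 1]
    seq_len(ncol(x)) %in% order(score, decreasing = TRUE)[seq_len(q)]
}

## one row per subsample of size floor(n / 2), one column per variable
sel <- t(replicate(B, {
    idx <- sample(n, floor(n / 2))
    select_top_q(x[idx, , drop = FALSE], y[idx], q)
}))
colnames(sel) <- colnames(x)

max_freq <- colMeans(sel)              # analogous to element `max`
selected <- which(max_freq >= cutoff)  # analogous to element `selected`
print(sort(max_freq, decreasing = TRUE))
print(selected)
```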

References

B. Hofner, L. Boccuto and M. Goeker (2015), Controlling false discoveries in high-dimensional situations: Boosting with stability selection. BMC Bioinformatics, 16:144.
doi: 10.1186/s12859-015-0575-3.

N. Meinshausen and P. Buehlmann (2010), Stability selection. Journal of the Royal Statistical Society, Series B, 72, 417–473.

R.D. Shah and R.J. Samworth (2013), Variable selection with error control: another look at stability selection. Journal of the Royal Statistical Society, Series B, 75, 55–80.

Examples

  if (require("TH.data")) {
      ## make data set available
      data("bodyfat", package = "TH.data")
  } else {
      ## simulate some data if TH.data is not available.
      ## Note that results are nonsense with these data.
      bodyfat <- matrix(rnorm(720), nrow = 72, ncol = 10)
  }
  
  ## set seed
  set.seed(1234)

  ####################################################################
  ### using stability selection with Lasso methods:

  if (require("lars")) {
      (stab.lasso <- stabsel(x = bodyfat[, -2], y = bodyfat[,2],
                             fitfun = lars.lasso, cutoff = 0.75,
                             PFER = 1))
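      ## save the graphical parameters so they can be restored afterwards
      opar <- par(no.readonly = TRUE)
      par(mfrow = c(2, 1))
      plot(stab.lasso, ymargin = 6)
      ## widen the right margin to make room for the path labels
      par(mai = par("mai") * c(1, 1, 1, 2.7))
      plot(stab.lasso, type = "paths")
      par(opar)
  }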

[Package stabs version 0.6-4 Index]