plot.stabsel {stabs} | R Documentation |
Plot and Print Methods for Stability Selection
Description
Display results of stability selection.
Usage
## S3 method for class 'stabsel'
plot(x, main = deparse(x$call), type = c("maxsel", "paths"),
xlab = NULL, ylab = NULL, col = NULL, ymargin = 10, np = sum(x$max > 0),
labels = NULL, ...)
## S3 method for class 'stabsel'
print(x, decreasing = FALSE, print.all = TRUE, ...)
Arguments
x |
object of class stabsel. |
main |
main title for the plot. |
type |
plot type; either stability paths ("paths") or a plot of the maximum selection frequency ("maxsel"). |
xlab , ylab |
labels for the x- and y-axis of the plot. By default, sensible labels are used depending on the plot type. |
col |
a vector of colors. Typically, one specifies either a single color or one color per variable. By default, colors depend on the maximal selection frequency of the variable and range from grey to red. |
ymargin |
(temporarily) specifies the y margin of the plot in
lines (see argument |
np |
number of variables to plot for the maximum selection
frequency plot ( |
labels |
variable labels for the plot; one label per variable / effect
must be specified. By default, the names of |
decreasing |
logical. Should the selection frequencies be printed in descending order (TRUE) or in ascending order (FALSE)? |
print.all |
logical. Should all selection frequencies be displayed or only those that are greater than zero? |
... |
additional arguments to |
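The effect of decreasing and print.all can be illustrated with a minimal base R sketch. The vector max_freq and its values are hypothetical (they are not output of stabsel()), and the code only mimics the behaviour described above; it is not the package's implementation.

```r
## Hypothetical maximum selection frequencies for five variables
## (illustrative values only, not produced by stabsel())
max_freq <- c(age = 0.12, waist = 0.93, hip = 0.81, knee = 0.00, elbow = 0.40)

## Mimic print.all = FALSE: keep only frequencies greater than zero
shown <- max_freq[max_freq > 0]

## Mimic decreasing = TRUE: order from highest to lowest frequency
shown <- sort(shown, decreasing = TRUE)
print(shown)
```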
Details
This function implements the stability selection procedure by Meinshausen and Buehlmann (2010) and the improved error bounds by Shah and Samworth (2013).
Two of the three arguments cutoff, q and PFER must be specified. The
per-family error rate (PFER), i.e., the expected number of false
positives E(V), where V is the number of false positives, is bounded
by the argument PFER.
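To see why two of the three quantities determine the third, the following sketch uses the original Meinshausen and Buehlmann (2010) bound, PFER <= q^2 / ((2 * cutoff - 1) * p), where p is the number of candidate variables. The helper mb_third_parameter is hypothetical and not part of the package; the package may instead apply the tighter Shah and Samworth (2013) bounds, depending on the assumption and sampling type used, so its values can differ.

```r
## Hypothetical helper (not part of stabs): derive the missing quantity
## from the Meinshausen-Buehlmann bound
##   PFER <= q^2 / ((2 * cutoff - 1) * p),   valid for 0.5 < cutoff <= 1
mb_third_parameter <- function(p, cutoff = NULL, q = NULL, PFER = NULL) {
    given <- c(cutoff = !is.null(cutoff), q = !is.null(q), PFER = !is.null(PFER))
    if (sum(given) != 2)
        stop("specify exactly two of 'cutoff', 'q' and 'PFER'")
    if (!given[["PFER"]]) {
        stopifnot(cutoff > 0.5, cutoff <= 1)
        ## upper bound on the expected number of false positives
        PFER <- q^2 / ((2 * cutoff - 1) * p)
    } else if (!given[["q"]]) {
        stopifnot(cutoff > 0.5, cutoff <= 1)
        ## largest integer q that keeps the bound at or below PFER
        q <- floor(sqrt(PFER * (2 * cutoff - 1) * p))
    } else {
        ## smallest cutoff that keeps the bound at or below PFER
        cutoff <- 0.5 * (q^2 / (PFER * p) + 1)
        if (cutoff > 1)
            stop("the bound cannot be met with this 'q' and 'PFER'")
    }
    list(cutoff = cutoff, q = q, PFER = PFER)
}

## Example with p = 9 candidate variables (an assumed value)
res <- mb_third_parameter(p = 9, cutoff = 0.75, PFER = 1)
print(res)
```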
As controlling the PFER is more conservative than controlling the
family-wise error rate (FWER), the procedure also controls the FWER,
i.e., the probability of selecting at least one non-influential
variable (or model component) is at most PFER.
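The step from PFER control to FWER control follows from Markov's inequality, since V is a non-negative count. As a short sketch, writing PFER for the value of the argument that bounds E(V):

```latex
\mathrm{FWER} \;=\; P(V \ge 1) \;\le\; \frac{E(V)}{1} \;=\; E(V) \;\le\; \mathrm{PFER}
```

The bound is only informative for the FWER when PFER is below 1, because a probability never exceeds 1 anyway.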
Value
An object of class stabsel with a special print method.
The object has the following elements:
phat |
selection probabilities. |
selected |
elements with maximal selection probability greater than cutoff. |
max |
maximum of selection probabilities. |
cutoff |
cutoff used. |
q |
average number of selected variables used. |
PFER |
per-family error rate. |
sampling.type |
the sampling type used for stability selection. |
assumption |
the assumptions made on the selection probabilities. |
call |
the call. |
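How phat, max, cutoff and selected relate to each other can be sketched in base R. The matrix below is made up for illustration, and the layout (variables in rows, steps of the regularisation path in columns) is an assumption; the toy values avoid the boundary case where a maximum equals the cutoff exactly.

```r
## Hypothetical selection probabilities: 4 variables (rows) x 3 steps of
## a regularisation path (columns); values are made up for illustration.
phat <- matrix(c(0.10, 0.30, 0.35,
                 0.60, 0.85, 0.90,
                 0.00, 0.05, 0.20,
                 0.40, 0.70, 0.80),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("x1", "x2", "x3", "x4"), NULL))
cutoff <- 0.75

## max: per-variable maximum of the selection probabilities over the path
max_phat <- apply(phat, 1, max)

## selected: variables whose maximal selection probability exceeds the cutoff
selected <- which(max_phat > cutoff)
print(selected)
```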
References
B. Hofner, L. Boccuto and M. Goeker (2015), Controlling false
discoveries in high-dimensional situations: Boosting with stability
selection. BMC Bioinformatics, 16:144.
doi: 10.1186/s12859-015-0575-3.
N. Meinshausen and P. Buehlmann (2010), Stability selection. Journal of the Royal Statistical Society, Series B, 72, 417–473.
R.D. Shah and R.J. Samworth (2013), Variable selection with error control: another look at stability selection. Journal of the Royal Statistical Society, Series B, 75, 55–80.
See Also
Examples
## load the package that provides stabsel() and lars.lasso
library("stabs")
## set seed (before any random numbers are drawn)
set.seed(1234)
if (require("TH.data")) {
    ## make data set available
    data("bodyfat", package = "TH.data")
} else {
    ## simulate some data if TH.data not available.
    ## Note that results are nonsense with this data.
    bodyfat <- matrix(rnorm(720), nrow = 72, ncol = 10)
}
####################################################################
### using stability selection with Lasso methods:
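if (require("lars")) {
    (stab.lasso <- stabsel(x = bodyfat[, -2], y = bodyfat[, 2],
                           fitfun = lars.lasso, cutoff = 0.75,
                           PFER = 1))
    ## save the graphics settings so that they can be restored afterwards
    opar <- par(no.readonly = TRUE)
    par(mfrow = c(2, 1))
    plot(stab.lasso, ymargin = 6)
    ## widen the right margin to make room for the path labels
    par(mai = par("mai") * c(1, 1, 1, 2.7))
    plot(stab.lasso, type = "paths")
    ## restore the original graphics settings
    par(opar)
}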