stabsel {bamlss} | R Documentation |
Stability selection.
Description
Performs stability selection based on gradient boosting.
Usage
stabsel(formula, data, family = "gaussian",
q, maxit, B = 100, thr = .9, fraction = 0.5, seed = NULL, ...)
## Plot selection frequencies.
## S3 method for class 'stabsel'
plot(x, show = NULL,
pal = function(n) gray.colors(n, start = 0.9, end = 0.3), ...)
Arguments
formula |
A formula or extended formula. |
data |
A |
family |
A |
q |
An integer specifying how many terms to select in each boosting run. |
maxit |
An integer specifying the maximum number of boosting iterations.
See |
B |
An integer. The boosting is run B times. |
thr |
Cut-off threshold of relative frequencies (between 0 and 1) for selection. |
fraction |
Numeric between 0 and 1. The fraction of data to be used in each boosting run. |
seed |
A seed to be set before the stability selection. |
x |
An object of class stabsel. |
show |
Number of terms to be shown. |
pal |
Color palette for different model terms. |
... |
Not used yet in |
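The pal argument is a function that takes a number n and returns n colors. As a minimal sketch, the snippet below reproduces the default from the Usage section and defines one possible alternative. The names pal_default and pal_alt are illustrative and not part of bamlss; hcl.colors is a base R function (R 3.6.0 or later).

```r
## The default from the Usage section: n gray levels from light (0.9) to dark (0.3).
pal_default <- function(n) gray.colors(n, start = 0.9, end = 0.3)
pal_default(3)

## A hypothetical alternative with the same interface (n in, n colors out).
pal_alt <- function(n) hcl.colors(n, palette = "viridis")
pal_alt(3)
```

Assuming sel is an object returned by stabsel, such a function would be passed as plot(sel, pal = pal_alt).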
Details
stabsel performs stability selection based on gradient boosting (opt_boost): The boosting algorithm is run B times on a randomly drawn fraction of the data.
Each boosting run is stopped either when q terms have been selected or when maxit iterations have been performed, i.e., either q or maxit can be used to tune the regularization of the boosting.
After the B boosting runs, the relative selection frequency of each term is evaluated.
Terms with a relative selection frequency larger than thr are suggested for a final regression model.
If neither q nor maxit has been specified, q will be set to the square root of the number of columns in data.
Gradient boosting does not depend on random numbers. Thus, the individual boosting runs differ only in the subset of data which is used.
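The subsample-and-count logic described above can be sketched in base R. This is an illustration under stated assumptions and not the implementation used by stabsel: the function select_terms is a hypothetical stand-in that picks the q predictors most correlated with the response, whereas stabsel uses gradient boosting for this step. The simulated data and all object names are likewise assumptions made for the example.

```r
## Simulated data: x1 and x2 carry signal, x3 to x6 are pure noise.
set.seed(1)
n <- 200
X <- as.data.frame(matrix(rnorm(n * 6), n, 6))
names(X) <- paste0("x", 1:6)
y <- 2 * X$x1 - 1.5 * X$x2 + rnorm(n)

## Hypothetical stand-in selector (NOT gradient boosting):
## return the names of the q predictors most correlated with y.
select_terms <- function(X, y, q) {
  score <- abs(cor(X, y))[, 1]
  names(sort(score, decreasing = TRUE))[seq_len(q)]
}

B <- 100         # number of runs
q <- 2           # terms selected per run
thr <- 0.9       # cut-off for the relative selection frequency
fraction <- 0.5  # share of the data used in each run

## One row per run, one column per candidate term (TRUE if selected).
picked <- t(replicate(B, {
  i <- sample(n, size = floor(fraction * n))
  names(X) %in% select_terms(X[i, , drop = FALSE], y[i], q)
}))
colnames(picked) <- names(X)

## Relative selection frequencies and the terms exceeding thr.
freq <- colMeans(picked)
selected <- names(freq)[freq > thr]
```

Because the stand-in selector is deterministic, the runs here differ only in the drawn subset, which mirrors the remark above about gradient boosting.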
Value
An object of class stabsel.
Author(s)
Thorsten Simon
Examples
## Not run:
## Simulate some data.
set.seed(111)
d <- GAMart()
n <- nrow(d)
## Add some noise variables.
for(i in 4:9)
d[[paste0("x",i)]] <- rnorm(n)
f <- paste0("~ ", paste("s(x", 1:9, ")", collapse = "+", sep = ""))
f <- paste(f, "+ te(lon,lat)")
f <- as.formula(f)
f <- list(update(f, num ~ .), f)
## Run stability selection.
sel <- stabsel(f, data = d, q = 6, B = 10)
plot(sel)
## Estimate selected model.
nf <- formula(sel)
b <- bamlss(nf, data = d)
plot(b)
## End(Not run)