stabsel {bamlss} | R Documentation |
Performs stability selection based on gradient boosting.
stabsel(formula, data, family = "gaussian",
  q, maxit, B = 100, thr = .9, fraction = 0.5, seed = NULL, ...)

## Plot selection frequencies.
## S3 method for class 'stabsel'
plot(x, show = NULL,
  pal = function(n) gray.colors(n, start = 0.9, end = 0.3), ...)
formula |
A formula or extended formula. |
data |
A data.frame. |
family |
A bamlss.family object. |
q |
An integer specifying how many terms to select in each boosting run. |
maxit |
An integer specifying the maximum number of boosting iterations.
See opt_boost. |
B |
An integer. The boosting is run B times. |
thr |
Cut-off threshold of relative frequencies (between 0 and 1) for selection. |
fraction |
Numeric between 0 and 1. The fraction of data to be used in each boosting run. |
seed |
A seed to be set before the stability selection. |
x |
An object of class stabsel. |
show |
Number of terms to be shown. |
pal |
Color palette for different model terms. |
... |
Not used yet in stabsel. |
stabsel performs stability selection based on gradient boosting (opt_boost): the boosting algorithm is run B times, each time on a randomly drawn fraction of the data. Each boosting run is stopped either when q terms have been selected or when maxit iterations have been performed, i.e., either q or maxit can be used to tune the regularization of the boosting. After the boosting runs, the relative selection frequencies are evaluated. Terms with a relative selection frequency larger than thr are suggested for a final regression model.

If neither q nor maxit has been specified, q will be set to the square root of the number of columns in data.
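The default for q can be illustrated with a small calculation. The data frame d below is hypothetical, and how stabsel turns a non-integer square root into an integer is not stated on this page, so the sketch only shows the raw value for a case where the root is exact.

```r
## Hypothetical data set with 16 columns (an assumption for illustration).
d <- as.data.frame(matrix(rnorm(20 * 16), nrow = 20))

## Default described above: square root of the number of columns in data.
q_default <- sqrt(ncol(d))
q_default
```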
Gradient boosting does not depend on random numbers. Thus, the individual boosting runs differ only in the subset of data which is used.
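The resampling logic described above can be sketched in base R. This is only an illustration of the mechanics, not the code used by stabsel: the selector select_terms() is a hypothetical stand-in that picks the q covariates most correlated with the response, where stabsel would run boosting via opt_boost. The simulated data and all object names are assumptions made for the example.

```r
set.seed(1)

## Simulated data: x1 and x2 carry signal, x3 to x6 are pure noise.
n <- 200
X <- as.data.frame(matrix(rnorm(n * 6), nrow = n))
names(X) <- paste0("x", 1:6)
y <- 2 * X$x1 - 1.5 * X$x2 + rnorm(n)

B <- 50          # number of resampling runs
fraction <- 0.5  # share of the data used in each run
q <- 2           # number of terms selected per run
thr <- 0.9       # cut-off for the relative selection frequency

## Hypothetical stand-in for one boosting run (NOT gradient boosting):
## return the names of the q covariates with the largest absolute
## correlation with the response.
select_terms <- function(X, y, q) {
  score <- abs(cor(X, y))[, 1]
  names(sort(score, decreasing = TRUE))[seq_len(q)]
}

## Run the selector B times, each time on a random subset of the rows.
sel <- replicate(B, {
  idx <- sample(n, size = floor(fraction * n))
  names(X) %in% select_terms(X[idx, , drop = FALSE], y[idx], q)
})
rownames(sel) <- names(X)

## Relative selection frequency of each term across the B runs.
freq <- rowMeans(sel)

## Terms whose frequency exceeds thr are suggested for the final model.
selected <- names(freq)[freq > thr]
selected
```

Because the stand-in selector is deterministic, the runs differ only through the sampled rows, which mirrors the remark above that the boosting runs differ only in the subset of data used.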
An object of class stabsel.
Thorsten Simon
## Not run: 
## Simulate some data.
set.seed(111)
d <- GAMart()
n <- nrow(d)

## Add some noise variables.
for(i in 4:9)
  d[[paste0("x",i)]] <- rnorm(n)

f <- paste0("~ ", paste("s(x", 1:9, ")", collapse = "+", sep = ""))
f <- paste(f, "+ te(lon,lat)")
f <- as.formula(f)
f <- list(update(f, num ~ .), f)

## Run stability selection.
sel <- stabsel(f, data = d, q = 6, B = 10)
plot(sel)

## Estimate selected model.
nf <- formula(sel)
b <- bamlss(nf, data = d)
plot(b)

## End(Not run)