crch.stabsel {crch} | R Documentation |
Auxiliary functions to perform stability selection using boosting.
Description
Auxiliary function to perform stability selection on heteroscedastic crch models based on crch.boost.
Usage
crch.stabsel(formula, data, ..., nu = 0.1, q, B = 100, thr = 0.9,
maxit = 2000, data_percentage = 0.5)
Arguments
formula |
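a formula expression of the form y ~ x | z, where y is the response and x and z are the regressors for the location and the scale of the fitted distribution, respectively. |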
data |
an optional data frame containing the variables occurring in the formulas. |
... |
Additional arguments to control the crch model (for example dist or left, as used in the Examples). |
nu |
Boosting step size (see crch.boost). Default is 0.1. |
q |
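Positive number. Number of parameters to be selected in each boosting run; boosting stops once this number is reached (intercepts are not counted). |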
B |
Number of times the data are resampled and boosted. Default is 100. |
thr |
Threshold on the selection frequency used to decide which parameters are kept. Default is 0.9. |
maxit |
Positive number. Maximum number of boosting iterations. Default is 2000. |
data_percentage |
Share of the data (given as a fraction) which should be sampled in each of the iterations. Default (and suggested) is 0.5. |
Details
crch.boost performs gradient boosting on heteroscedastic additive models. crch.stabsel is a wrapper around the core crch.boost algorithm that performs stability selection (see References).
A subsample of the data set (data), half of it by default (see data_percentage), is drawn B times, and crch.boost is run on each subsample. Rather than iterating until a stopping criterion is reached (e.g., the maximum number of iterations maxit), the algorithm stops as soon as q parameters have been selected.
The number of selected parameters is counted across both the location and the scale model. Intercepts are not counted.
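The procedure described above can be illustrated with a small base R sketch. It is a simplified stand-in, not the crch implementation: it uses componentwise least-squares boosting on a location-only linear model, and the helper names boost_until_q and stabsel_sketch are hypothetical. It only shows the three ingredients: subsampling, stopping once q variables are selected, and thresholding the selection frequencies at thr.

```r
# Simplified sketch of stability selection with boosting (base R only).
# Assumption: location-only linear model; crch.stabsel itself boosts
# location and scale of a censored/truncated regression model.

# Componentwise L2 boosting that stops once q distinct variables are selected.
boost_until_q <- function(x, y, q, nu = 0.1, maxit = 2000) {
  x <- scale(x)                 # standardize the regressors
  res <- y - mean(y)            # residuals after fitting the intercept
  selected <- integer(0)
  for (it in seq_len(maxit)) {
    ss <- colSums(x^2)
    beta <- drop(crossprod(x, res)) / ss   # single-variable LS slopes
    j <- which.max(beta^2 * ss)            # largest reduction in RSS
    selected <- union(selected, j)
    if (length(selected) >= q) break       # q variables reached: stop
    res <- res - nu * beta[j] * x[, j]     # shrunken update with step size nu
  }
  selected
}

# Repeat the boosting on B random subsamples and count selections.
stabsel_sketch <- function(x, y, q, B = 100, thr = 0.9,
                           data_percentage = 0.5) {
  n <- nrow(x)
  counts <- numeric(ncol(x))
  for (b in seq_len(B)) {
    idx <- sample(n, floor(data_percentage * n))  # without replacement
    sel <- boost_until_q(x[idx, , drop = FALSE], y[idx], q = q)
    counts[sel] <- counts[sel] + 1
  }
  freq <- counts / B
  names(freq) <- colnames(x)
  list(freq = freq, selected = names(freq)[freq >= thr])
}

set.seed(1)
x <- matrix(rnorm(500 * 10), 500, 10,
            dimnames = list(NULL, paste0("V", 1:10)))
y <- 1 + x[, 1] - 1.5 * x[, 2] + rnorm(500)
out <- stabsel_sketch(x, y, q = 3, B = 20)
out$freq      # selection frequency of each regressor
out$selected  # regressors with frequency >= thr
```

Variables with a real effect tend to be picked in nearly every subsample, while the remaining slots up to q are filled by different noise variables from run to run, so their frequencies usually stay below thr.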
Value
Returns an object of class "stabsel.crch" containing the stability selection summary and the new formula based on the stability selection.
table |
A table object containing the parameters which have been selected and the corresponding frequency of selection. |
formula.org |
Original formula used to perform the stability selection. |
formula.new |
New formula including only the coefficients selected during stability selection. |
family |
A list object which contains the distribution-specification from
the |
parameter |
List with the parameters used to perform the stability selection
including |
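As an illustration of what a two-part formula such as formula.new looks like, the following base R sketch assembles one from selected term names. The term names and the helper rhs are hypothetical, and this is not necessarily how crch.stabsel builds the formula internally.

```r
# Hypothetical selected terms for the location and the scale part.
location_terms <- c("V2", "V3")
scale_terms <- c("V4")

# Right-hand side for one part; intercept only if nothing was selected.
rhs <- function(terms) {
  if (length(terms) > 0) paste(terms, collapse = " + ") else "1"
}

# Two-part formula: location terms before "|", scale terms after it.
new_formula <- as.formula(
  paste("y ~", rhs(location_terms), "|", rhs(scale_terms))
)
new_formula
```

A formula of this shape can be passed to crch in the same way as stabsel$formula.new in the Examples.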
References
Meinshausen N, Buehlmann P (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417–473. doi: 10.1111/j.1467-9868.2010.00740.x.
See Also
Examples
library("crch")

# generate data
suppressWarnings(RNGversion("3.5.0"))
set.seed(5)
x <- matrix(rnorm(1000 * 20), 1000, 20)
y <- rnorm(1000, 1 + x[, 1] - 1.5 * x[, 2], exp(-1 + 0.3 * x[, 3]))
y <- pmax(0, y)
data <- data.frame(cbind(y, x))

# fit model with maximum likelihood
CRCH1 <- crch(y ~ . | ., data = data, dist = "gaussian", left = 0)

# perform stability selection
stabsel <- crch.stabsel(y ~ . | ., data = data, dist = "gaussian", left = 0,
                        q = 8, B = 5)

# show stability selection summary
print(stabsel)
plot(stabsel)

# refit with the selected terms: maximum likelihood and boosting
CRCH2 <- crch(stabsel$formula.new, data = data, dist = "gaussian", left = 0)
BOOST <- crch(stabsel$formula.new, data = data, dist = "gaussian", left = 0,
              control = crch.boost())
# log-likelihood comparison of the three fits
sapply(list(CRCH1, CRCH2, BOOST), logLik)