lmSelect {lmSubsets}    R Documentation
Best-subset regression
Description
Best-variable-subset selection in ordinary linear regression.
Usage
lmSelect(formula, ...)
## Default S3 method:
lmSelect(formula, data, subset, weights, na.action,
         model = TRUE, x = FALSE, y = FALSE, contrasts = NULL,
         offset, ...)
Arguments
formula, data, subset, weights, na.action, model, x, y, contrasts, offset
standard formula interface
...
forwarded to
Details
The lmSelect() generic provides various methods to conveniently
specify the regressor and response variables. The standard formula
interface (see lm()) can be used, or the model information can be
extracted from an already fitted "lm" object. The model matrix and
response can also be passed in directly.
After processing the arguments, the call is forwarded to lmSelect_fit().
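As a rough illustration of the inputs mentioned above, the following base-R sketch shows how a formula and a data frame yield a model matrix and a response, and how the same pieces can be recovered from an already fitted "lm" object. It uses only functions from stats and the built-in mtcars data, both chosen here for illustration. It does not call lmSubsets and does not describe how lmSelect() processes its arguments internally.

```r
## illustrative only: base R and the built-in 'mtcars' data (not lmSubsets)

## formula interface: build the model frame, then extract x and y
mf <- model.frame(mpg ~ wt + hp + qsec, data = mtcars)
x <- model.matrix(attr(mf, "terms"), mf)  # regressors, incl. intercept column
y <- model.response(mf)                   # response vector

## the same information extracted from a fitted "lm" object
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
x_fit <- model.matrix(fit)
y_fit <- model.response(model.frame(fit))

## compare the two routes
dim(x)
all.equal(x, x_fit, check.attributes = FALSE)
all.equal(unname(y), unname(y_fit))
```

Either pair (x, y) is the kind of model matrix and response that the Details paragraph refers to as being passed in directly.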
Value
"lmSelect"
a list containing the components returned by lmSelect_fit().
Further components include call, na.action, weights, offset,
contrasts, xlevels, terms, mf, x, and y. See lm() for more information.
See Also
lmSelect.matrix() for the matrix interface
lmSelect.lmSubsets() for coercing an all-subsets regression
lmSelect_fit() for the low-level interface
lmSubsets() for all-subsets regression
Examples
## load data
data("AirPollution", package = "lmSubsets")
###################
## basic usage ##
###################
## fit 20 best subsets (BIC)
lm_best <- lmSelect(mortality ~ ., data = AirPollution, nbest = 20)
lm_best
## summary statistics
summary(lm_best)
## visualize
plot(lm_best)
########################
## custom criterion ##
########################
## the same as above, but with a custom criterion:
M <- nrow(AirPollution)
ll <- function (rss) {
  -M/2 * (log(2 * pi) - log(M) + log(rss) + 1)
}
aic <- function (size, rss, k = 2) {
  -2 * ll(rss) + k * (size + 1)
}
bic <- function (size, rss) {
  aic(size, rss, k = log(M))
}
lm_cust <- lmSelect(mortality ~ ., data = AirPollution,
                    penalty = bic, nbest = 20)
lm_cust
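The custom criterion above is the Gaussian log-likelihood written in terms of the residual sum of squares, plus a penalty on the number of estimated parameters. The sketch below is a sanity check of those formulas against stats::logLik() and stats::BIC() for a single ordinary lm() fit. It uses the built-in mtcars data and takes size to be the number of coefficients including the intercept. Both are assumptions made for this illustration, not statements about how lmSelect() calls the penalty function.

```r
## illustrative check: base R and the built-in 'mtcars' data (not lmSubsets)
fit <- lm(mpg ~ wt + hp, data = mtcars)

M <- nrow(mtcars)            # number of observations
rss <- sum(residuals(fit)^2) # residual sum of squares
size <- length(coef(fit))    # assumption: size counts the intercept

## Gaussian log-likelihood as a function of the RSS
ll <- function (rss) {
  -M/2 * (log(2 * pi) - log(M) + log(rss) + 1)
}

## BIC: the "+ 1" accounts for the estimated error variance
bic <- function (size, rss) {
  -2 * ll(rss) + log(M) * (size + 1)
}

## both comparisons should agree up to numerical tolerance
all.equal(ll(rss), as.numeric(logLik(fit)))
all.equal(bic(size, rss), BIC(fit))
```

Under these assumptions, the hand-written bic() reproduces the value reported by BIC() for the same fit.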