lmSelect {lmSubsets}    R Documentation

Best-subset regression

Description

Best-variable-subset selection in ordinary linear regression.

Usage

lmSelect(formula, ...)

## Default S3 method:
lmSelect(formula, data, subset, weights, na.action,
         model = TRUE, x = FALSE, y = FALSE, contrasts = NULL,
         offset, ...)

Arguments

formula, data, subset, weights, na.action, model, x, y, contrasts, offset

Standard formula interface arguments; see lm().

...

Further arguments forwarded to lmSelect_fit().

Details

The lmSelect() generic provides various methods to conveniently specify the regressor and response variables. The standard formula interface (see lm()) can be used, or the model information can be extracted from an already fitted "lm" object. The model matrix and response can also be passed in directly.

After processing the arguments, the call is forwarded to lmSelect_fit().
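
As a minimal sketch of the first two entry points described above, the calls below select subsets once from a formula and once from an already fitted "lm" object. It assumes the lmSubsets package is installed; the object names (sel_formula, fit_lm, sel_lm) and the choice nbest = 5 are illustrative only, and passing the "lm" object as the first argument is an assumption based on the description above.

```r
## Sketch only: assumes the lmSubsets package is installed.
library("lmSubsets")
data("AirPollution", package = "lmSubsets")

## (1) formula interface, as in lm()
sel_formula <- lmSelect(mortality ~ ., data = AirPollution, nbest = 5)

## (2) start from an already fitted "lm" object
##     (assumption: the fitted model is passed as the first argument)
fit_lm <- lm(mortality ~ ., data = AirPollution)
sel_lm <- lmSelect(fit_lm, nbest = 5)

## inspect the class of both results
class(sel_formula)
class(sel_lm)
```

Starting from a fitted "lm" object may be convenient when the full model has already been estimated and checked before subset selection.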

Value

An object of class "lmSelect": a list containing the components returned by lmSelect_fit().

Further components include call, na.action, weights, offset, contrasts, xlevels, terms, mf, x, and y. See lm() for more information.

See Also

Examples

## load data
data("AirPollution", package = "lmSubsets")


###################
##  basic usage  ##
###################

## fit 20 best subsets (BIC)
lm_best <- lmSelect(mortality ~ ., data = AirPollution, nbest = 20)
lm_best

## summary statistics
summary(lm_best)

## visualize
plot(lm_best)


########################
##  custom criterion  ##
########################

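## the same as above, but with a user-defined criterion (BIC):
M <- nrow(AirPollution)  # number of observations

## Gaussian log-likelihood, expressed in terms of the residual sum of squares
ll <- function (rss) {
  -M/2 * (log(2 * pi) - log(M) + log(rss) + 1)
}

## AIC-type criterion; "size + 1" counts the error variance as a parameter
aic <- function (size, rss, k = 2) {
  -2 * ll(rss) + k * (size + 1)
}

## BIC is the AIC-type criterion with penalty factor k = log(M)
bic <- function (size, rss) {
  aic(size, rss, k = log(M))
}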

lm_cust <- lmSelect(mortality ~ ., data = AirPollution,
                    penalty = bic, nbest = 20)
lm_cust
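
The formulas behind the custom criterion can be sanity-checked against base R. The sketch below does not use lmSubsets: it fits an ordinary lm() on the built-in mtcars data and compares an RSS-based log-likelihood and BIC with stats::logLik() and stats::BIC(). The names n_obs, ll_rss and bic_rss are hypothetical, and it is an assumption that size counts all estimated coefficients, including the intercept.

```r
## Sketch only: uses base R and the built-in mtcars data, not lmSubsets.
fit <- lm(mpg ~ wt + hp, data = mtcars)

n_obs <- nrow(mtcars)            # number of observations
rss   <- sum(residuals(fit)^2)   # residual sum of squares
size  <- length(coef(fit))       # assumption: size counts the intercept

## Gaussian log-likelihood in terms of the RSS (same form as ll() above)
ll_rss <- function (rss) {
  -n_obs/2 * (log(2 * pi) - log(n_obs) + log(rss) + 1)
}

## BIC with the error variance counted as an extra parameter
bic_rss <- function (size, rss) {
  -2 * ll_rss(rss) + log(n_obs) * (size + 1)
}

## compare with the reference implementations in stats
ll_ok  <- isTRUE(all.equal(ll_rss(rss), as.numeric(logLik(fit))))
bic_ok <- isTRUE(all.equal(bic_rss(size, rss), BIC(fit)))
c(logLik = ll_ok, BIC = bic_ok)
```

Under the stated assumption about size, both comparisons should agree, which suggests the custom bic() is the usual BIC written as a function of subset size and RSS.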

[Package lmSubsets version 0.5-2 Index]