lmSelect_fit {lmSubsets} | R Documentation |
Best-subset regression
Description
Low-level interface to best-variable-subset selection in ordinary linear regression.
Usage
lmSelect_fit(x, y, weights = NULL, offset = NULL, include = NULL,
exclude = NULL, penalty = "BIC", tolerance = 0,
nbest = 1, ..., pradius = NULL)
Arguments
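x |
model matrix of regressors (see lm.fit()) |
y |
response vector (see lm.fit()) |
weights |
optional observation weights (see lm.fit()) |
offset |
optional offset (see lm.fit()) |
include |
regressors to force into the regression |
exclude |
regressors to force out of the regression |
penalty |
information criterion: "AIC", "BIC", or a numeric penalty-per-model-parameter |
tolerance |
approximation tolerance |
nbest |
number of best submodels to return |
... |
ignored |
pradius |
preordering radius |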
Details
The function determines the best variable-subset model, that is, the model with the lowest information criterion value. The information criterion belongs to the AIC family.
The regression data is specified with the x, y, weights, and offset parameters. See lm.fit() for further details.
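As a point of reference for the data conventions, the following minimal sketch shows how lm.fit() itself treats x, y, weights, and offset. It uses the built-in mtcars data purely for illustration. Whether lmSelect_fit() handles an intercept column in the same way is not stated on this page, so treat the parallel as an assumption.

```r
## Built-in data, used here only for illustration.
x <- as.matrix(mtcars[, c("wt", "hp")])
y <- mtcars$mpg

## lm.fit() uses the matrix as given: no intercept is added automatically.
fit_raw <- lm.fit(x, y)

## Bind an explicit column of ones to obtain an intercept.
x1 <- cbind("(Intercept)" = 1, x)
fit_int <- lm.fit(x1, y)

## Weighted fits go through lm.wfit(); an offset is subtracted from y
## before fitting. Unit weights and a zero offset reproduce fit_int.
w <- rep(1, nrow(x1))
fit_w <- lm.wfit(x1, y, w, offset = rep(0, nrow(x1)))

names(fit_raw$coefficients)
names(fit_int$coefficients)
```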
To force regressors into or out of the regression, a list of regressors can be passed as an argument to the include or exclude parameters, respectively.
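A minimal sketch of forcing regressors in and out, assuming that include and exclude accept column names of x (the accepted forms are not spelled out on this page). The columns are picked by position only to avoid assuming particular variable names, and the call is guarded so the snippet also runs where lmSubsets is not installed.

```r
have_pkg <- requireNamespace("lmSubsets", quietly = TRUE)

if (have_pkg) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  ## Assumption: regressors can be identified by column name.
  keep <- colnames(x)[1:2]  # always part of the selected model
  drop <- colnames(x)[3]    # never part of the selected model

  f_forced <- lmSubsets::lmSelect_fit(x, y, include = keep, exclude = drop)
  f_forced
}
```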
The information criterion is specified with the penalty parameter. Accepted values are "AIC", "BIC", or a "numeric" value representing the penalty-per-model-parameter.
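To illustrate what a penalty-per-model-parameter means in the AIC family, the sketch below uses stats::AIC() on an ordinary lm() fit, where the criterion is -2 * logLik + k * npar. Mapping "AIC" to k = 2 and "BIC" to k = log(nobs) is the usual convention. That lmSelect_fit() computes its criterion in exactly this form is an assumption, not something this page confirms.

```r
## Built-in data, used here only for illustration.
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nobs(fit)

ic_aic    <- AIC(fit, k = 2)       # classical AIC
ic_bic    <- AIC(fit, k = log(n))  # BIC: the penalty grows with sample size
ic_custom <- AIC(fit, k = 4)       # any other numeric penalty per parameter

c(AIC = ic_aic, BIC = ic_bic, custom = ic_custom)
```

A larger penalty favours smaller models, because every additional parameter costs more.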
A custom selection criterion may be specified by passing an R function as an argument. The expected signature is function (size, rss), where size is the number of predictors (including the intercept, if any), and rss is the residual sum of squares. The function must be non-decreasing in both parameters.
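A minimal sketch of a custom criterion with the (size, rss) signature described above. The helper make_criterion() is hypothetical and exists only for this illustration, as does the sample size of 60. The checks confirm numerically, on a small grid, that the function is non-decreasing in both arguments. Passing the function through the penalty parameter is an assumption, and the call is guarded so the snippet also runs without lmSubsets.

```r
## Hypothetical helper: builds a BIC-like criterion for a given sample size.
make_criterion <- function(nobs, k = log(nobs)) {
  function(size, rss) nobs * log(rss / nobs) + k * size
}

crit <- make_criterion(nobs = 60)  # 60 is an illustrative sample size

## Non-decreasing in size (rss fixed) and in rss (size fixed).
mono_size <- all(diff(crit(1:15, rss = 1200)) >= 0)
mono_rss  <- all(diff(crit(5, rss = seq(100, 5000, by = 100))) >= 0)

have_pkg <- requireNamespace("lmSubsets", quietly = TRUE)
if (have_pkg) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]
  ## Assumption: the function is supplied via the penalty parameter.
  f_custom <- lmSubsets::lmSelect_fit(x, y,
                                      penalty = make_criterion(nrow(x)))
}
```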
An approximation tolerance can be specified to speed up the search.
The number of returned submodels is determined by the nbest parameter.
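The tolerance and nbest settings described above can be combined as in the following sketch. The values 5 and 0.1 are arbitrary illustrations, and the exact guarantee attached to a given tolerance is not described on this page. The call is guarded so the snippet also runs without lmSubsets.

```r
have_pkg <- requireNamespace("lmSubsets", quietly = TRUE)

if (have_pkg) {
  data("AirPollution", package = "lmSubsets")
  x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
  y <- AirPollution[, "mortality"]

  ## Exact search, returning the five best submodels.
  f_exact <- lmSubsets::lmSelect_fit(x, y, nbest = 5)

  ## Approximate search: a positive tolerance trades exactness for speed.
  f_approx <- lmSubsets::lmSelect_fit(x, y, nbest = 5, tolerance = 0.1)
}
```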
The preordering radius is given with the pradius parameter.
Value
A list
with the following components:
NOBS |
|
nobs |
|
nvar |
|
weights |
|
intercept |
|
include |
|
exclude |
|
size |
|
ic |
information criterion |
tolerance |
|
nbest |
|
submodel |
|
subset |
|
References
Hofmann M, Gatu C, Kontoghiorghes EJ, Colubi A, Zeileis A (2020). lmSubsets: Exact variable-subset selection in linear regression for R. Journal of Statistical Software, 93(3), 1–21. doi:10.18637/jss.v093.i03.
See Also
lmSelect() for the high-level interface
lmSubsets_fit() for all-subsets regression
Examples
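## load the example data shipped with the package
data("AirPollution", package = "lmSubsets")
## regressor matrix: every column except the response
x <- as.matrix(AirPollution[, names(AirPollution) != "mortality"])
## response vector
y <- AirPollution[, names(AirPollution) == "mortality"]
## best subset under the default penalty ("BIC")
f <- lmSelect_fit(x, y)
f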