ltsReg {robustbase} | R Documentation |
Least Trimmed Squares Robust (High Breakdown) Regression
Description
Carries out least trimmed squares (LTS) robust (high breakdown point) regression.
Usage
ltsReg(x, ...)
## S3 method for class 'formula'
ltsReg(formula, data, subset, weights, na.action,
model = TRUE, x.ret = FALSE, y.ret = FALSE,
contrasts = NULL, offset, ...)
## Default S3 method:
ltsReg(x, y, intercept = TRUE, alpha = , nsamp = , adjust = ,
mcd = TRUE, qr.out = FALSE, yname = NULL,
seed = , trace = , use.correction = , wgtFUN = , control = rrcov.control(),
...)
Arguments
formula |
a |
data |
data frame from which variables specified in
|
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
weights |
an optional vector of weights to be used in the fitting process. NOT USED YET. |
na.action |
a function which indicates what should happen
when the data contain |
model , x.ret , y.ret |
|
contrasts |
an optional list. See the |
offset |
this can be used to specify an a priori
known component to be included in the linear predictor
during fitting. An |
x |
a matrix or data frame containing the explanatory variables. |
y |
the response: a vector of length the number of rows of |
.
intercept |
if true, a model with constant term will be
estimated; otherwise no constant term will be included. Default is
|
alpha |
the percentage (roughly) of squared residuals whose sum will be
minimized, by default 0.5. In general, |
nsamp |
number of subsets used for initial estimates or
|
adjust |
whether to perform intercept adjustment at each step.
Since this can be time consuming, the default is |
mcd |
whether to compute robust distances using Fast-MCD. |
qr.out |
whether to return the QR decomposition (see
|
yname |
the name of the dependent variable. Default is |
seed |
initial seed for random generator, like
|
trace |
logical (or integer) indicating if intermediate results
should be printed; defaults to |
use.correction |
whether to use finite sample correction factors.
Default is |
wgtFUN |
a character string or |
control |
a list with estimation options, the same as those provided in the function specification. If the control object is supplied, its parameters will be used. If parameters are also passed in the call itself, they override the corresponding elements of the control object. |
... |
arguments passed to or from other methods. |
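The precedence rule for control can be illustrated with the following sketch, which fits the heart data twice: once taking alpha from a control object only, and once also passing alpha in the call. It assumes that the robustbase package is installed (hence the requireNamespace() guard) and that the alpha component of the result reports the value actually used; the names ctrl, fit.ctrl and fit.call are illustrative only.

```r
## Sketch only: an argument given in the call is expected to override
## the corresponding element of 'control'.
if (requireNamespace("robustbase", quietly = TRUE)) {
  data(heart, package = "robustbase")
  heart.x <- data.matrix(heart[, 1:2])   # the X-variables
  heart.y <- heart[, "clength"]

  ctrl <- robustbase::rrcov.control(alpha = 0.75)

  ## alpha is taken from the control object
  fit.ctrl <- robustbase::ltsReg(heart.x, heart.y, control = ctrl)
  ## alpha is given in the call as well as in the control object
  fit.call <- robustbase::ltsReg(heart.x, heart.y, control = ctrl, alpha = 0.9)

  ## Compare the alpha values recorded in the two fits
  print(c(control = fit.ctrl$alpha, call = fit.call$alpha))
}
```

If the two printed values differ in the way described above, the explicit argument has taken precedence over the control object.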
Details
The LTS regression method minimizes the sum of the h smallest squared
residuals, where h > n/2, i.e. at least half the number of observations
must be used. The default value of h (when alpha=1/2) is roughly n / 2,
more precisely, (n+p+1) %/% 2 where n is the total number of
observations, but by setting alpha, the user may choose higher values up
to n, where h = h(\alpha,n,p) = h.alpha.n(alpha,n,p). The LTS estimate of
the error scale is given by the minimum of the objective function
multiplied by a consistency factor and a finite sample correction factor;
see Pison et al. (2002) for details. The rescaling factors for the raw and
final estimates are also returned, in the vectors raw.cnp2 and cnp2
respectively, each of length 2. The finite sample corrections can be
suppressed by setting use.correction=FALSE. The computations are performed
using the Fast LTS algorithm proposed by Rousseeuw and Van Driessen (1999).
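To make the objective concrete, here is a minimal base R sketch of the default subset size and of the trimmed sum of squares, evaluated on made-up data with one gross outlier. The helpers h_default() and lts_objective() are hypothetical and not part of robustbase, and p is taken here to be the number of columns of the design matrix including the intercept, which is an assumption. The sketch only evaluates the criterion for given coefficients; it does not implement the Fast LTS search.

```r
## Hypothetical helper: default subset size for alpha = 1/2
h_default <- function(n, p) (n + p + 1) %/% 2

## Hypothetical helper: sum of the h smallest squared residuals
lts_objective <- function(beta, X, y, h) {
  r2 <- as.vector(y - X %*% beta)^2
  sum(sort(r2)[seq_len(h)])
}

x <- 1:10
y <- 2 + 3 * x
y[10] <- 100                 # one gross outlier
X <- cbind(1, x)             # design matrix with intercept column

n <- nrow(X)
p <- ncol(X)
h <- h_default(n, p)         # (10 + 2 + 1) %/% 2 = 6, which exceeds n/2

## The line y = 2 + 3x fits nine points exactly; the outlier is trimmed
crit.true <- lts_objective(c(2, 3), X, y, h)

## Ordinary least squares is pulled towards the outlier
beta.ols <- qr.solve(X, y)
crit.ols <- lts_objective(beta.ols, X, y, h)

print(c(h = h, true.line = crit.true, ols = crit.ols))
```

The trimmed criterion is zero for the line that generated the clean points and strictly positive for the least squares fit, which is why minimizing it resists the outlier.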
As always, the formula interface has an implied intercept term which can
be removed either by y ~ x - 1 or y ~ 0 + x. See formula for more details.
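The two ways of dropping the implied intercept can be checked without fitting anything, by looking at the design matrix that base R's model.matrix() builds from each formula. This is a small sketch on made-up data (the data frame d is hypothetical), not a call to ltsReg itself.

```r
## Made-up data, for illustration only
d <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))

cols.default <- colnames(model.matrix(y ~ x,     data = d))
cols.minus1  <- colnames(model.matrix(y ~ x - 1, data = d))
cols.zero    <- colnames(model.matrix(y ~ 0 + x, data = d))

print(cols.default)   # "(Intercept)" "x"
print(cols.minus1)    # "x"
print(cols.zero)      # "x"
```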
Value
The function ltsReg returns an object of class "lts".
The summary method function is used to obtain (and print) a summary table
of the results, and plot() can be used to plot them; see the specific
help pages.
The generic accessor functions coefficients, fitted.values and residuals
extract various useful features of the value returned by ltsReg.
An object of class lts is a list containing at least the following
components:
crit |
the value of the objective function of the LTS regression method,
i.e., the sum of the |
coefficients |
vector of coefficient estimates (including the intercept by default when
|
best |
the best subset found and used for computing the raw estimates, with
|
fitted.values |
vector like |
residuals |
vector like |
scale |
scale estimate of the reweighted residuals. |
alpha |
same as the input parameter |
quan |
the number |
intercept |
same as the input parameter |
cnp2 |
a vector of length two containing the consistency correction factor and the finite sample correction factor of the final estimate of the error scale. |
raw.coefficients |
vector of raw coefficient estimates (including
the intercept, when |
raw.scale |
scale estimate of the raw residuals. |
raw.resid |
vector like |
raw.cnp2 |
a vector of length two containing the consistency correction factor and the finite sample correction factor of the raw estimate of the error scale. |
lts.wt |
vector like y containing weights that can be used in a weighted least squares. These weights are 1 for points with reasonably small residuals, and 0 for points with large residuals. |
raw.weights |
vector containing the raw weights based on the raw residuals and raw scale. |
method |
character string naming the method (Least Trimmed Squares). |
X |
the input data as a matrix (including intercept column if applicable). |
Y |
the response variable as a vector. |
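The components above can be inspected directly on a fitted object. The sketch below, which assumes the robustbase package is installed, fits the stackloss data, tabulates the 0/1 weights in lts.wt, and reuses them in an ordinary weighted least squares fit with lm(). The names fit, w and wls are illustrative; the side-by-side print is for comparison only and makes no claim about how closely the two sets of coefficients agree.

```r
## Sketch only: reuse the 0/1 weights from an LTS fit in lm()
if (requireNamespace("robustbase", quietly = TRUE)) {
  fit <- robustbase::ltsReg(stack.loss ~ ., data = stackloss)

  ## 1 = observation kept, 0 = observation with a large residual
  w <- fit$lts.wt
  print(table(w))

  ## Weighted least squares using those weights
  wls <- lm(stack.loss ~ ., data = stackloss, weights = w)

  ## Coefficients of the LTS fit next to the weighted least squares fit
  print(cbind(lts = coef(fit), wls = coef(wls)))
}
```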
Author(s)
Valentin Todorov valentin.todorov@chello.at, based on work written for S-plus by Peter Rousseeuw and Katrien van Driessen from University of Antwerp.
References
Peter J. Rousseeuw (1984), Least Median of Squares Regression. Journal of the American Statistical Association 79, 871–881.
P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection. Wiley.
P. J. Rousseeuw and K. van Driessen (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223.
G. Pison, S. Van Aelst and G. Willems (2002) Small Sample Corrections for LTS and MCD. Metrika 55, 111–123.
See Also
covMcd; summary.lts for summaries, lmrob() for an alternative robust
estimator with a high breakdown point (HBDP).
The generic functions coef, residuals, fitted.
Examples
library(robustbase)
data(heart)
## Default method works with 'x'-matrix and y-var:
heart.x <- data.matrix(heart[, 1:2]) # the X-variables
heart.y <- heart[,"clength"]
ltsReg(heart.x, heart.y)
data(stackloss)
ltsReg(stack.loss ~ ., data = stackloss)