lmrob {robustbase}  R Documentation 
MMtype Estimators for Linear Regression
Description
Computes fast MMtype estimators for linear (regression) models.
Usage
lmrob(formula, data, subset, weights, na.action, method = "MM",
model = TRUE, x = !control$compute.rd, y = FALSE,
singular.ok = TRUE, contrasts = NULL, offset = NULL,
control = NULL, init = NULL, ...)
Arguments
formula 
a symbolic description of the model to be fit. See

data 
an optional data frame, list or environment (or object
coercible by 
subset 
an optional vector specifying a subset of observations to be used in the fitting process. 
weights 
an optional vector of weights to be used in the fitting process (in addition to the robustness weights computed in the fitting process). 
na.action 
a function which indicates what should happen
when the data contain 
method 
string specifying the estimatorchain. 
model , x , y 
logicals. If 
singular.ok 
logical. If 
contrasts 
an optional list. See the 
offset 
this can be used to specify an a priori
known component to be included in the linear predictor
during fitting. An 
control 
a 
init 
an optional argument to specify or supply the initial estimate. See Details. 
... 
additional arguments can be used to specify control
parameters directly instead of (but not in addition to!) via 
Details
 Overview:

This function computes an MMtype regression estimator as described in Yohai (1987) and Koller and Stahel (2011). By default it uses a bisquare redescending score function, and it returns a highly robust and highly efficient estimator (with 50% breakdown point and 95% asymptotic efficiency for normal errors). The computation is carried out by a call to
lmrob.fit()
.The argument
setting
oflmrob.control
is provided to set alternative defaults as suggested in Koller and Stahel (2011) (setting="KS2011"
; now do use its extensionsetting="KS2014"
). For further details, seelmrob.control
.  Initial Estimator
init
: 
The initial estimator may be specified using the argument
init
. This can either bea string used to specify built in internal estimators (currently
"S"
and"MS"
, see See also below);a
function
taking argumentsx, y, control, mf
(wheremf
stands formodel.frame
) and returning alist
containing at least the initial coefficients as component"coefficients"
and the initial scale estimate as"scale"
.Or a
list
giving the initial coefficients and scale as components"coefficients"
and"scale"
. See also Examples.
Note that when
init
is a function or list, themethod
argument must not contain the initial estimator, e.g., useMDM
instead ofSMDM
.The default, equivalent to
init = "S"
, uses as initial estimator an Sestimator (Rousseeuw and Yohai, 1984) which is computed using the FastS algorithm of SalibianBarrera and Yohai (2006), callinglmrob.S()
. That function, since March 2012, by default uses nonsingular subsampling which makes the FastS algorithm feasible for categorical data as well, see Koller (2012). Note that convergence problems may still show up as warnings, e.g.,S refinements did not converge (to refine.tol=1e07) in 200 (= k.max) steps
and often can simply be remedied by increasing (i.e. weakening)
refine.tol
or increasing the allowed number of iterationsk.max
, seelmrob.control
.  Method
method
: 
The following chain of estimates is customizable via the
method
argument. There are currently two types of estimates available,"M"
:corresponds to the standard Mregression estimate.
"D"
:stands for the Design Adaptive Scale estimate as proposed in Koller and Stahel (2011).
The
method
argument takes a string that specifies the estimates to be calculated as a chain. Settingmethod='SMDM'
will result in an intial Sestimate, followed by an Mestimate, a Design Adaptive Scale estimate and a final Mstep. For methods involving aD
step, the default value ofpsi
(seelmrob.control
) is changed to"lqq"
.By default, standard errors are computed using the formulas of Croux, Dhaene and Hoorelbeke (2003) (
lmrob.control
optioncov=".vcov.avar1"
). This method, however, works only for MMestimates (i.e.,method = "MM"
or= "SM"
). For othermethod
arguments, the covariance matrix estimate used is based on the asymptotic normality of the estimated coefficients (cov=".vcov.w"
) as described in Koller and Stahel (2011). The varcov computation can be skipped bycov = "none"
and (re)done later by e.g.,vcov(<obj>, cov = ".vcov.w")
.As of robustbase version 0.910 (April 2014), the computation of robust standard errors for
method="SMDM"
has been changed. The old behaviour can be restored by setting the control parametercov.corrfact = "tauold"
.
Value
An object of class lmrob
; a list including the following
components:
coefficients 
The estimate of the coefficient vector 
scale 
The scale as used in the M estimator. 
residuals 
Residuals associated with the estimator. 
converged 

iter 
number of IRWLS iterations 
rweights 
the “robustness weights” 
fitted.values 
Fitted values associated with the estimator. 
init.S 
The 
init 
A similar list that contains the results of intermediate estimates (not for MMestimates). 
rank 
the numeric rank of the fitted linear model. 
cov 
The estimated covariance matrix of the regression coefficients 
df.residual 
the residual degrees of freedom. 
weights 
the specified weights (missing if none were used). 
na.action 
(where relevant) information returned by

offset 
the offset used (missing if none were used). 
contrasts 
(only where relevant) the contrasts used. 
xlevels 
(only where relevant) a record of the levels of the factors used in fitting. 
call 
the matched call. 
terms 
the 
model 
if requested (the default), the model frame used. 
x 
if requested, the model matrix used. 
y 
if requested, the response used. 
In addition, nonnull fits will have components assign
,
and qr
relating to the linear fit, for use by extractor
functions such as summary
.
Author(s)
(mainly:) Matias SalibianBarrera and Manuel Koller
References
Croux, C., Dhaene, G. and Hoorelbeke, D. (2003) Robust standard errors for robust estimators, Discussion Papers Series 03.16, K.U. Leuven, CES.
Koller, M. (2012)
Nonsingular subsampling for Sestimators with categorical predictors,
ArXiv eprints https://arxiv.org/abs/1208.5595;
extended version published as Koller and Stahel (2017), see
lmrob.control
.
Koller, M. and Stahel, W.A. (2011) Sharpening Waldtype inference in robust regression for small samples. Computational Statistics & Data Analysis 55(8), 2504–2515.
Maronna, R. A., and Yohai, V. J. (2000) Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference 89, 197–214.
Rousseeuw, P.J. and Yohai, V.J. (1984) Robust regression by means of Sestimators, In Robust and Nonlinear Time Series, J. Franke, W. HÃ¤rdle and R. D. Martin (eds.). Lectures Notes in Statistics 26, 256–272, Springer Verlag, New York.
SalibianBarrera, M. and Yohai, V.J. (2006) A fast algorithm for Sregression estimates, Journal of Computational and Graphical Statistics 15(2), 414–427. doi:10.1198/106186006X113629
Yohai, V.J. (1987) High breakdownpoint and high efficiency estimates for regression. The Annals of Statistics 15, 642–65.
Yohai, V., Stahel, W.~A. and Zamar, R. (1991) A procedure for robust estimation and inference in linear regression; in Stahel and Weisberg (eds), Directions in Robust Statistics and Diagnostics, Part II, Springer, New York, 365–374; doi:10.1007/9781461244448_20.
See Also
lmrob.control
;
for the algorithms lmrob.S
, lmrob.M.S
and
lmrob.fit
;
and for methods,
summary.lmrob
, for the extra “statistics”,
notably R^2
(“R squared”);
predict.lmrob
,
print.lmrob
, plot.lmrob
, and
weights.lmrob
.
Examples
data(coleman)
set.seed(0)
## Default for a very long time:
summary( m1 < lmrob(Y ~ ., data=coleman) )
## Nowadays **strongly recommended** for routine use:
summary(m2 < lmrob(Y ~ ., data=coleman, setting = "KS2014") )
## 
plot(residuals(m2) ~ weights(m2, type="robustness")) ##> weights.lmrob()
abline(h=0, lty=3)
data(starsCYG, package = "robustbase")
## Plot simple data and fitted lines
plot(starsCYG)
lmST < lm(log.light ~ log.Te, data = starsCYG)
(RlmST < lmrob(log.light ~ log.Te, data = starsCYG))
abline(lmST, col = "red")
abline(RlmST, col = "blue")
## > Least Sq.:/ negative slope \ robust: slope ~= 2.2 % checked in ../tests/lmrobdata.R
summary(RlmST) # > 4 outliers; rest perfect
vcov(RlmST)
stopifnot(all.equal(fitted(RlmST),
predict(RlmST, newdata = starsCYG), tol = 1e14))
## FIXME: setting = "KS2011" or setting = "KS2014" **FAIL** here
## 'init' argument 
## 1) string
set.seed(0)
m3 < lmrob(Y ~ ., data=coleman, init = "S")
stopifnot(all.equal(m1[18], m3[18]))
## 2) function
initFun < function(x, y, control, ...) { # no 'mf' needed
init.S < lmrob.S(x, y, control)
list(coefficients=init.S$coef, scale = init.S$scale)
}
set.seed(0)
m4 < lmrob(Y ~ ., data=coleman, method = "M", init = initFun)
## list
m5 < lmrob(Y ~ ., data=coleman, method = "M",
init = list(coefficients = m3$init$coef, scale = m3$scale))
stopifnot(all.equal(m4[17], m5[17]))