lmRob {robust} | R Documentation |
High Breakdown and High Efficiency Robust Linear Regression
Description
Performs a robust linear regression with high breakdown point and high efficiency.
Usage
lmRob(formula, data, weights, subset, na.action,
model = TRUE, x = FALSE, y = FALSE, contrasts = NULL,
nrep = NULL, control = lmRob.control(...), ...)
Arguments
formula |
a |
data |
a |
weights |
vector of observation weights; if supplied, the algorithm fits to minimize the sum of a function of the square root of the weights multiplied into the residuals. The length of |
subset |
expression saying which subset of the rows of the data should be used in the fit. This can be a logical vector (which is replicated to have length equal to the number of observations), or a numeric vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default. |
na.action |
a function to filter missing data. This is applied to the |
model |
a logical flag: if |
x |
a logical flag: if |
y |
a logical flag: if |
contrasts |
a list giving contrasts for some or all of the factors appearing in the model formula. The elements of the list should have the same name as the variable and should be either a contrast matrix (specifically, any full-rank matrix with as many rows as there are levels in the factor), or else a function to compute such a matrix given the number of levels. |
nrep |
the number of random subsamples to be drawn. If |
control |
a list of control parameters to be used in the numerical algorithms. See |
... |
additional arguments are passed to the control functions. |
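As a sketch of the two forms the contrasts argument accepts (a contrast matrix, or a function that builds one), the list below is passed to base R's model.matrix, which uses the same naming convention. It is assumed, not confirmed here, that lmRob interprets the list the same way. The data frame d and the factor g are made up for illustration.

```r
# Hypothetical toy data: one response and one three-level factor.
d <- data.frame(y = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.4),
                g = factor(c("a", "a", "b", "b", "c", "c")))

# Form 1: a contrast matrix with one row per factor level.
ctr_matrix <- list(g = contr.sum(3))

# Form 2: a function that builds the matrix from the number of levels.
ctr_function <- list(g = contr.helmert)

# model.matrix() is used here only to show the resulting design columns.
X_sum <- model.matrix(y ~ g, data = d, contrasts.arg = ctr_matrix)
X_helmert <- model.matrix(y ~ g, data = d, contrasts.arg = ctr_function)

print(X_sum)
print(X_helmert)
```

Each list element is named after the factor it applies to; factors not named in the list keep their default contrasts.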
Details
By default, the lmRob function automatically chooses an appropriate algorithm to compute a final robust estimate with high breakdown point and high efficiency. The final robust estimate is computed based on an initial estimate with high breakdown point. For the initial estimation, the alternate M-S estimate is used if there are any factor variables in the predictor matrix, and an S-estimate is used otherwise. To compute the S-estimate, a random resampling or a fast procedure is used unless the data set is small, in which case exhaustive resampling is employed. See lmRob.control for how to choose between the different algorithms.
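The random-resampling idea behind a high-breakdown initial estimate can be illustrated in base R. The sketch below is not the algorithm implemented in lmRob: it uses the MAD of the residuals as a simple stand-in for the S-scale, and the function resample_fit and the simulated data are hypothetical.

```r
# Illustrative sketch only; NOT the lmRob implementation.
# Fit exact regressions to random subsets of p observations and keep the
# candidate whose residuals have the smallest robust scale (MAD about zero).
resample_fit <- function(X, y, nrep = 500, seed = 1) {
  set.seed(seed)
  n <- nrow(X)
  p <- ncol(X)
  best <- list(scale = Inf, coef = rep(NA_real_, p))
  for (i in seq_len(nrep)) {
    idx <- sample.int(n, p)
    # Skip subsamples whose design matrix is singular.
    b <- tryCatch(solve(X[idx, , drop = FALSE], y[idx]),
                  error = function(e) NULL)
    if (is.null(b)) next
    s <- mad(drop(y - X %*% b), center = 0)
    if (s < best$scale) best <- list(scale = s, coef = as.vector(b))
  }
  best
}

# Simulated line y = 2 + 3x with 5 of 20 responses shifted upward by 50.
set.seed(42)
x <- 1:20
y <- 2 + 3 * x + rnorm(20, sd = 0.5)
y[16:20] <- y[16:20] + 50
X <- cbind(1, x)

fit <- resample_fit(X, y)
ols <- unname(coef(lm(y ~ x)))

print(fit$coef)  # resampling candidate with the smallest residual MAD
print(ols)       # least squares, pulled toward the shifted points
```

Drawing more subsamples (a larger nrep) raises the chance that at least one subsample contains no outliers, at the cost of more computation; exhaustive resampling would instead enumerate every subset of size p.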
Value
A list describing the regression. Note that the solution returned here is an approximation to the true solution based upon a random algorithm (except when "Exhaustive" resampling is chosen). Hence you will get (slightly) different answers each time if you make the same call with a different seed. See lmRob.control for how to set the seed, and see lmRob.object for a complete description of the object returned.
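The seed dependence can be checked by refitting with two seeds. This sketch assumes that lmRob.control accepts arguments named seed and initial.alg, and that "Random" is a valid value for initial.alg; these names are recalled from the robust package rather than stated on this page, so check ?lmRob.control before relying on them. The code does nothing if the package is not installed.

```r
# Run only when the robust package is available.
if (requireNamespace("robust", quietly = TRUE)) {
  library(robust)
  data(stack.dat)

  # Assumed control arguments: seed and initial.alg (see ?lmRob.control).
  fit1 <- lmRob(Loss ~ ., data = stack.dat,
                control = lmRob.control(initial.alg = "Random", seed = 1))
  fit2 <- lmRob(Loss ~ ., data = stack.dat,
                control = lmRob.control(initial.alg = "Random", seed = 2))

  # The two coefficient vectors may or may not agree exactly.
  print(coef(fit1))
  print(coef(fit2))
  print(all.equal(coef(fit1), coef(fit2)))
}
```

Fixing the seed in the control list makes a fit reproducible across runs.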
References
Gervini, D., and Yohai, V. J. (1999). A class of robust and fully efficient regression estimates; mimeo, Universidad de Buenos Aires.
Marazzi, A. (1993). Algorithms, routines, and S functions for robust statistics. Wadsworth & Brooks/Cole, Pacific Grove, CA.
Maronna, R. A., and Yohai, V. J. (2000). Robust regression with both continuous and categorical predictors. Journal of Statistical Planning and Inference 89, 197–214.
Peña, D., and Yohai, V. (1999). A Fast Procedure for Outlier Diagnostics in Large Regression Problems. Journal of the American Statistical Association 94, 434–445.
Yohai, V. J. (1987). High breakdown-point and high efficiency robust estimates for regression. Annals of Statistics 15, 642–656.
Yohai, V., Stahel, W. A., and Zamar, R. H. (1991). A procedure for robust estimation and inference in linear regression; in Stahel, W. A. and Weisberg, S. W., Eds., Directions in robust statistics and diagnostics, Part II. Springer-Verlag.
See Also
Examples
library(robust)
data(stack.dat)
# Robust fit of Loss on all remaining variables in stack.dat
stack.rob <- lmRob(Loss ~ ., data = stack.dat)