wBACON_reg {wbacon} | R Documentation |
Robust Fitting of Linear Regression Models by the BACON Algorithm
Description
The weighted BACON algorithm is a robust method for fitting weighted linear regression models. The method is robust against outliers in the response variable and in the design matrix (leverage observations).
Usage
wBACON_reg(formula, weights = NULL, data, collect = 4, na.rm = FALSE,
alpha = 0.05, version = c("V2", "V1"), maxiter = 50, verbose = FALSE,
original = FALSE, n_threads = 2)
## S3 method for class 'wbaconlm'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
## S3 method for class 'wbaconlm'
summary(object, ...)
## S3 method for class 'wbaconlm'
fitted(object, ...)
## S3 method for class 'wbaconlm'
residuals(object, ...)
## S3 method for class 'wbaconlm'
coef(object, ...)
## S3 method for class 'wbaconlm'
vcov(object, ...)
Arguments
formula |
an object of class |
weights |
|
data |
a |
collect |
determines the size |
na.rm |
|
alpha |
|
version |
method to initialize the basic subset, |
maxiter |
|
verbose |
|
original |
|
n_threads |
|
digits |
|
object |
object of class wbaconlm. |
x |
object of class wbaconlm. |
... |
additional arguments passed to the method. |
Details
First, the wBACON method is applied to the model's design matrix (with the regression intercept/constant removed, if there is one) to establish a subset of observations that is presumed to be free of outliers. Second, the response is regressed on the design matrix using only the observations in this subset. The subset is then iteratively enlarged to include as many “good” observations as possible.
The original approach of Billor et al. (2000) is obtained by specifying the argument original = TRUE.
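As a minimal sketch, the default fit and the original approach can be compared on the iris data used in the Examples section. The object names f, fit_w and fit_o are illustrative only, and the calls are guarded so that they are skipped when the wbacon package is not installed.

```r
# Sketch: compare the default wBACON fit with the original BACON approach.
# f, fit_w and fit_o are illustrative names, not part of the package.
if (requireNamespace("wbacon", quietly = TRUE)) {
    data(iris)
    f <- Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width
    fit_w <- wbacon::wBACON_reg(f, data = iris)                   # default
    fit_o <- wbacon::wBACON_reg(f, data = iris, original = TRUE)  # Billor et al. (2000)
    # Estimated coefficients of both fits, side by side
    print(cbind(default = coef(fit_w), original = coef(fit_o)))
}
```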
Models for wBACON_reg are specified symbolically. A typical model has the form response ~ terms, where response is the (numeric) response vector and terms is a series of terms that specifies a linear predictor for response. A formula has an implied intercept term. To remove it, use either y ~ x - 1 or y ~ 0 + x. See formula or lm for more details.
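The effect of the intercept syntax can be checked with base R alone. The sketch below uses model.matrix (not wBACON_reg) to show the design matrix implied by each formula; the object names X_int, X_noi1 and X_noi2 are illustrative.

```r
# Base R only: inspect the design matrix implied by a formula.
data(iris)
X_int  <- model.matrix(Sepal.Length ~ Petal.Length, data = iris)      # implied intercept
X_noi1 <- model.matrix(Sepal.Length ~ Petal.Length - 1, data = iris)  # intercept removed
X_noi2 <- model.matrix(Sepal.Length ~ 0 + Petal.Length, data = iris)  # alternative syntax

colnames(X_int)    # includes the intercept column
colnames(X_noi1)   # predictor only
# Both ways of removing the intercept give the same numeric matrix
all.equal(X_noi1, X_noi2, check.attributes = FALSE)
```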
The weights argument can be used to specify sampling weights or case weights.
It is not possible to fit multiple response variables (on the l.h.s. of the formula, i.e., multivariate models) in one call.
The method cannot deal with missing values. If the argument na.rm is set to TRUE, the method behaves like na.omit.
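For reference, the following base R sketch shows what na.omit does to a small, made-up data frame; it illustrates na.omit only and makes no claim about how wBACON_reg handles missing values internally. The data frame d and its values are hypothetical.

```r
# Base R only: na.omit drops every row that contains at least one NA.
# The data frame d is made up for illustration.
d <- data.frame(y = c(1.2, NA, 3.1, 4.0),
                x = c(0.5, 1.0, NA, 2.0))
d_complete <- na.omit(d)

nrow(d)           # number of rows before removal
nrow(d_complete)  # number of complete rows
```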
Assumptions
The algorithm assumes that the non-outlying data follow a linear (homoscedastic) regression model and that the independent variables have (roughly) an elliptically contoured distribution. “Although the algorithms will often do something reasonable even when these assumptions are violated, it is hard to say what the results mean.” (Billor et al., 2000, p. 289)
In line with Billor et al. (2000, p. 290), we use the term outlier “nomination” rather than “detection” to highlight that algorithms should not go beyond nominating observations as potential outliers. It is left to the analyst to finally label outlying observations as such.
Utility functions and tools
The generic functions coef, fitted, residuals, and vcov extract the estimated coefficients, fitted values, residuals, and the covariance matrix of the estimated coefficients. The function summary summarizes the estimated model.
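A short sketch of how these extractor functions might be combined, again on the iris data; the derived standard errors (square roots of the diagonal of the covariance matrix) and the object names are illustrative, and the block is skipped when the wbacon package is not installed.

```r
# Sketch: use the extractor functions on a fitted wbaconlm object.
# Object names (m, b, V, se, r, yhat) are illustrative.
if (requireNamespace("wbacon", quietly = TRUE)) {
    data(iris)
    m <- wbacon::wBACON_reg(Sepal.Length ~ Sepal.Width + Petal.Length +
                                Petal.Width, data = iris)
    b    <- coef(m)        # estimated coefficients
    V    <- vcov(m)        # covariance matrix of the coefficients
    se   <- sqrt(diag(V))  # standard errors derived from V
    r    <- residuals(m)   # residuals
    yhat <- fitted(m)      # fitted values
    print(cbind(estimate = b, std.error = se))
}
```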
Value
An object of class wbaconlm with slots
coefficients |
a named vector of coefficients |
residuals |
the residuals (for all observations in the data.frame, not only the ones in the final subset) |
rank |
the numeric rank of the fitted linear model (i.e., number of variables in the design matrix) |
fitted.values |
fitted values |
df.residual |
the residual degrees of freedom (computed for the observations in the final subset) |
call |
the matched call |
terms |
the |
model |
the |
weights |
weights |
qr |
the |
subset |
the subset |
reg |
a list with additional details on |
mv |
a list with details on the results of |
References
Billor, N., Hadi, A.S. and Velleman, P.F. (2000). BACON: Blocked Adaptive Computationally efficient Outlier Nominators. Computational Statistics and Data Analysis 34, pp. 279–298. doi:10.1016/S0167-9473(99)00101-2
Schoch, T. (2021). wbacon: Weighted BACON algorithms for multivariate outlier nomination (detection) and robust linear regression. Journal of Open Source Software 6 (62), 3238. doi:10.21105/joss.03238
See Also
plot gives diagnostic plots for a wbaconlm object.
predict is used for prediction (incl. confidence and prediction intervals).
Examples
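library(wbacon)
data(iris)
# Robust regression of sepal length on the remaining numeric variables
m <- wBACON_reg(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
data = iris)
# Print the fitted model, then its summary
m
summary(m)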