rlm {complexlm} | R Documentation |
Robust Fitting of Linear Models, Compatible with Complex Variables
Description
Uses robust M-estimation to fit a linear model to numeric or complex data. Based on MASS::rlm.
Usage
rlm(x, ...)
## S3 method for class 'formula'
rlm(
formula,
data,
weights,
...,
subset,
na.action,
method = c("M", "MM", "model.frame"),
wt.method = c("inv.var", "case"),
model = TRUE,
x.ret = TRUE,
y.ret = FALSE,
contrasts = NULL
)
## S3 method for class 'complex'
rlm(
x,
y,
weights,
...,
w = rep(1, nrow(x)),
init = "ls",
psi = psi.huber,
scale.est = c("MAD", "Huber", "proposal 2"),
k2 = 1.345,
method = c("M", "MM"),
wt.method = c("inv.var", "case"),
maxit = 20,
acc = 1e-04,
test.vec = "resid",
lqs.control = NULL,
interc = FALSE
)
Arguments
x |
numeric or complex. A matrix, dataframe, or vector containing the explanatory / independent / predictor variables. |
... |
additional arguments to be passed to rlm.default or to the psi function. |
formula |
a formula object of the form y ~ x1 + x2. Note that algebraic expressions in formula cannot currently be used with complex data. |
data |
a data frame containing the variables upon which a robust fit is to be applied. |
weights |
numeric. A vector of weights to apply to the residuals. |
subset |
an index vector specifying the cases (rows of data or x and y) to be used for fitting. |
na.action |
a function that specifies what to do if NAs are found in the fitting data. The default is to omit them via na.omit. Can also be changed by options (na.action =). |
method |
string. What method of robust estimation should be used. Options are "M", "MM", or "model.frame". The default is M-estimation. MM-estimation has a high breakdown point but is not compatible with complex variables or case weights. model.frame just constructs the model frame, and only works with the formula method. |
wt.method |
string, either "inv.var" or "case". Specifies whether the weights are case weights that give the relative importance of each observation (higher weight means more important) / case, or the inverse variances of the cases (higher weight means that observation is less variable / uncertain). |
model |
logical. Should the model frame be included in the returned object? |
x.ret |
logical. Should the model (design) matrix be included in the returned object? |
y.ret |
logical. Should the response be included in the returned object? |
contrasts |
optional contrast specifications: see stats::lm. Not compatible with complex data. |
y |
numeric or complex. A vector containing the dependent / response variables, the same length as x. |
w |
(optional) initial down-weighting for each case |
init |
(optional) initial values for the coefficients OR a method to find initial values OR the result of a fit with a coef component. Known methods are "ls" (the default) for an initial least-squares fit using weights w*weights, and "lts" for an unweighted least-trimmed squares fit with 200 samples. |
psi |
the psi function is specified by this argument. It must give (possibly by name) a function g(x, ..., deriv) that for deriv=0 returns psi(x)/x and for deriv=1 returns psi'(x). Tuning constants will be passed in via ... |
scale.est |
method of scale estimation: re-scaled MAD of the residuals (default) or Huber's proposal 2 (which can be selected by either "Huber" or "proposal 2"). Only MAD is implemented for complex variables. |
k2 |
tuning constant used for Huber proposal 2 scale estimation. |
maxit |
maximum number of IWLS iterations. |
acc |
the accuracy for the stopping criterion. |
test.vec |
the stopping criterion is based on changes in this vector. |
lqs.control |
An optional list of control values for lqs, ignored. |
interc |
TRUE or FALSE, default is FALSE. Used with rlm.default when fitting complex valued data. If true, a y-intercept is calculated during fitting. Otherwise, the intercept is set to zero. |
Details
M-estimation works by finding the model coefficients that minimize the sum of a function of the residuals.
This function, called the objective function rho(), is a kind of statistical distance (AKA divergence), and a semimetric.
As a semimetric it is a function of the measured value y
and the modeled value Y
(residual r = y - Y
) which maps from
the space of the data to the positive real numbers. Semimetrics can be defined for domains of any dimensionality, including the
two dimensional complex numbers, and thus so can M-estimation.
What's more, all the standard algebraic operations used in the itteratively (re)weighted least-squares M-estimator robust regression
algorithm are defined over the set of complex numbers. While ordering is not defined for them, it is the output of rho(), a real number, that must be
in M-estimation.
Value
An object of class c("rzlm", "zlm", "rlm", "lm")
, or for numeric data c("rlm", "lm")
.
Objects of class "rzlm" are lists with the same components as "zlm" objects, as well as,
df.residual |
|
s |
The robust scale estimate used. |
w |
The weights used in the IWLS process. |
psi |
The psi (itterative weighting) function with parameters substituted in. |
conv |
The value of the convergence criteria at each iteration. |
converged |
Did the IWLS process converge? |
wresid |
A 'working residual', the residuals of the last re-weighted least-squares. Weighted by |
See MASS::rlm for a description of "rlm" objects.
Methods (by class)
-
formula
: S3 method for class 'formula' -
complex
: Complex Default S3 method
References
P. J. Huber (1981) Robust Statistics. Wiley.
F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw and W. A. Stahel (1986) Robust Statistics: The Approach based on Influence Functions. Wiley.
A. Marazzi (1993) Algorithms, Routines and S Functions for Robust Statistics. Wadsworth & Brooks/Cole.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
Examples
set.seed(4242)
n <- 8
slope <- complex(real = 4.23, imaginary = 2.323)
interc <- complex(real = 1.4, imaginary = 1.804)
e <- complex(real=rnorm(n)/6, imaginary=rnorm(n)/6)
xx <- complex(real= rnorm(n), imaginary= rnorm(n))
tframe <- data.frame(x = xx, y= slope*xx + interc + e)
rlm(y ~ x, data = tframe, weights = rep(1,n))
set.seed(4242)
n <- 8
slope <- complex(real = 4.23, imaginary = 2.323)
intercept <- complex(real = 1.4, imaginary = 1.804)
e <- complex(real=rnorm(n)/6, imaginary=rnorm(n)/6)
x <- complex(real = rnorm(n), imaginary = rnorm(n))
y <- slope * x + intercept + e
rlm(x = x, y = y, weights = rep(1,n), interc = TRUE)