crm {crmReg} R Documentation

## Cellwise Robust M-regression

### Description

Fits a cellwise robust M-regression (CRM) estimator. Besides a vector of regression coefficients, the function returns an imputed data set containing estimates of the values that the cellwise outliers would need to take in order to fit the model.

### Usage

crm(formula, data, maxiter = 100, tolerance = 0.01, outlyingness.factor = 1,
spadieta = seq(0.9, 0.1, -0.1), center = "median", scale = "qn",
regtype = "MM", alphaLTS = NULL, seed = NULL, verbose = TRUE)

### Arguments

| Argument | Description |
| --- | --- |
| formula | an lm-style formula object specifying which relationship to estimate. |
| data | the data as a data frame. |
| maxiter | maximum number of iterations (default is 100). |
| tolerance | obtain optimal regression coefficients to within a certain tolerance (default is 0.01). |
| outlyingness.factor | numeric value, greater than or equal to 1 (default is 1). Cells are altered only in cases whose original outlyingness (before SPADIMO) is larger than outlyingness.factor times the outlyingness after SPADIMO. The larger this factor, the fewer cells are imputed. |
| spadieta | the sparsity parameter(s) with which to start internal outlying cell detection; values must lie in the range [0,1] (default is seq(0.9, 0.1, -0.1)). |
| center | how to center the data. A string that matches the R function to be used for centering (default is "median"). |
| scale | how to scale the data. Choices are "no" (no scaling) or a string matching the R function to be used for scaling (default is "qn"). |
| regtype | type of robust regression. Choices are "MM" (default) or "LTS". |
| alphaLTS | parameter used by LTS regression. The percentage (roughly) of squared residuals whose sum will be minimized (documented default is 0.5; the argument itself is NULL in the usage above). |
| seed | initial seed for the random generator, like .Random.seed (default is NULL). |
| verbose | whether output should be shown during the process (default is TRUE). |

### Details

The cellwise robust M-regression (CRM) estimator (Filzmoser et al., 2020) is a linear regression estimator that intrinsically yields both a map of cellwise outliers consistent with the linear model, and a vector of regression coefficients that is robust against vertical outliers and leverage points. As a by-product, the method yields a weighted and imputed data set that contains estimates of what the values in cellwise outliers would need to amount to if they had fit the model. The CRM method consists of an iteratively reweighted least squares procedure where SPADIMO is applied at each iteration to detect the cells that contribute most to outlyingness. As such, CRM detects deviating data cells consistent with a linear model.
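The reweighting loop described above can be sketched in base R. This is a deliberately simplified illustration and not the crmReg implementation: it uses Huber-type case weights, starts from ordinary least squares, and omits the SPADIMO step and the cell imputation entirely. The simulated data, the tuning constant `k`, and the convergence check are assumptions made for this example.

```r
# Simplified iteratively reweighted least squares (illustration only).
# No SPADIMO step and no imputation: only case weights are updated.
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
y[1:5] <- y[1:5] + 15                 # plant five vertical outliers

X <- cbind(Intercept = 1, x = x)
beta <- lm.fit(X, y)$coefficients     # ordinary least squares start (assumption)
k <- 1.345                            # Huber tuning constant (assumption)

for (iter in seq_len(100)) {          # 100 mirrors the maxiter default
  r <- y - drop(X %*% beta)           # current residuals
  s <- mad(r)                         # robust residual scale
  w <- pmin(1, k / abs(r / s))        # weight 1 for small residuals, less for large ones
  beta_new <- lm.wfit(X, y, w)$coefficients
  converged <- max(abs(beta_new - beta)) < 0.01   # 0.01 mirrors the tolerance default
  beta <- beta_new
  if (converged) break
}

print(round(beta, 2))                 # estimates should be close to the true values 1 and 2
print(round(w[1:5], 3))               # the planted outliers receive small weights
```

In this sketch a whole case is down-weighted when its residual is large. CRM goes further by asking, through SPADIMO, which cells of such a case are responsible, which is what makes the imputed data set possible.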

### Value

crm returns a list object of class "crm" containing the following elements:

| Element | Description |
| --- | --- |
| coefficients | a named vector of fitted coefficients. |
| fitted.values | the fitted response values. |
| residuals | the residuals, that is response minus fitted values. |
| weights | the (case) weights of the residuals. |
| data.imputed | the data as imputed by CRM. |
| casewiseoutliers | a vector that indicates the casewise outliers with TRUE or FALSE. |
| cellwiseoutliers | a matrix that indicates the cellwise outliers as the (scaled) difference between the original data and imputed data, both scaled and centered. |
| terms | the terms object used. |
| call | the matched call. |
| inputs | the list of supplied input arguments. |
| numloops | the number of iterations. |
| time | the number of seconds passed to execute the CRM algorithm. |
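To make the description of `cellwiseoutliers` more concrete, the following base R sketch shows one way to read "the (scaled) difference between the original data and imputed data". It is a toy illustration under stated assumptions, not the package's code: the data values are made up, the imputed copy is hypothetical, and `median` and `mad` are used as base R stand-ins for the "median" and "qn" defaults.

```r
# Toy data: three cases, two numeric variables (values are made up).
original <- data.frame(x1 = c(1.0, 2.0, 30.0), x2 = c(5.0, 6.0, 7.0))

# Hypothetical imputed copy: only the cell in row 3, column x1 was altered.
imputed <- original
imputed$x1[3] <- 3.0

# Center and scale both data sets with statistics of the original data.
# median/mad are stand-ins chosen for this sketch (assumption).
centers <- vapply(original, median, numeric(1))
scales  <- vapply(original, mad, numeric(1))
z_original <- scale(original, center = centers, scale = scales)
z_imputed  <- scale(imputed,  center = centers, scale = scales)

# Scaled difference: zero where a cell was left untouched.
cell_diff <- z_original - z_imputed
print(round(cell_diff, 2))

# Row/column positions of the cells changed by imputation.
print(which(cell_diff != 0, arr.ind = TRUE))
```

Under this reading, the sign of an entry shows whether the original cell was above or below its imputed value, and the magnitude shows by how many scale units.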

### Author(s)

Peter Filzmoser, Sebastiaan Hoppner, Irene Ortner, Sven Serneels, and Tim Verdonck

### References

Filzmoser, P., Hoppner, S., Ortner, I., Serneels, S., and Verdonck, T. (2020). Cellwise Robust M regression. Computational Statistics and Data Analysis, 147, 106944. DOI:10.1016/j.csda.2020.106944

### See Also

spadimo, predict.crm, cellwiseheatmap, daprpr

### Examples

library(crmReg)
data(topgear)

# fit Cellwise Robust M-regression:
crmfit <- crm(formula = MPG ~ ., data = topgear)

# estimated regression coefficients and detected casewise outliers:
print(crmfit$coefficients)
print(rownames(topgear)[which(crmfit$casewiseoutliers)])

# fitted response values (MPG) versus true response values:
plot(topgear$MPG, crmfit$fitted.values, xlab = "True MPG", ylab = "Fitted MPG")
abline(a = 0, b = 1)

# residuals:
plot(crmfit$residuals, ylab = "Residuals")
text(x = which(crmfit$residuals > 30), y = crmfit$residuals[which(crmfit$residuals > 30)],
     labels = rownames(topgear)[which(crmfit$residuals > 30)], pos = 2)

print(cbind.data.frame(car = rownames(topgear), MPG = topgear$MPG)[which(crmfit$residuals > 30), ])

# cellwise heatmap of casewise outliers:
cellwiseheatmap(cellwiseoutliers = crmfit$cellwiseoutliers[which(crmfit$casewiseoutliers), ],
                data = round(topgear[which(crmfit$casewiseoutliers), -7], 2),
                col.scale.factor = 1/4)
# check the plotted heatmap!


[Package crmReg version 1.0.2 Index]