spadimo {crmReg} | R Documentation |
SPArse DIrections of Maximal Outlyingness
Description
Computes the sparse directions of maximal outlyingness of a given observation and shows diagnostic plots for analyzing that observation.
Usage
spadimo(data, weights, obs,
        control = list(scaleFun = Qn, nlatent = 1, etas = NULL, csqcritv = 0.975,
                       stopearly = FALSE, trace = FALSE, plot = TRUE))
Arguments
data |
the data as a data frame. |
weights |
a numeric vector containing the case weights from a robust estimator. |
obs |
the (integer) case number under consideration. |
control |
a list of options that control details of the
|
Details
Given an observation that has been detected as an outlier, SPADIMO (Debruyne et al., 2019) finds the subset of variables contributing most to the outlier’s outlyingness. Here, the outlyingness of a data point is defined as its robust Mahalanobis distance. The relevant variables are found by checking the direction in which the observation is most outlying. SPADIMO estimates this direction of maximal outlyingness in a sparse manner. In this way, the method helps to explain why an observation is outlying.
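The notion of a variable "contributing" to a Mahalanobis-type outlyingness can be made concrete with a small base-R sketch. This is not the SPADIMO algorithm and does not use crmReg: the simulated data, the 0/1 case weights, and the names `X`, `w`, `d_full`, `d_drop` and `cutoff_full` are all assumptions made for illustration, and the leave-one-variable-out loop is a naive stand-in for the sparse direction that SPADIMO estimates.

```r
# Illustrative sketch only; this is NOT the SPADIMO algorithm from crmReg.
# All data and names below (X, w, d_full, d_drop) are made up for the example.
set.seed(1)
n <- 100
p <- 4
X <- matrix(rnorm(n * p), nrow = n, ncol = p,
            dimnames = list(NULL, paste0("V", seq_len(p))))
X[1, "V2"] <- 8  # plant one outlying cell in case 1

# Hypothetical 0/1 case weights, standing in for those of a robust estimator:
# case 1 is downweighted to 0, all other cases keep weight 1.
w <- c(0, rep(1, n - 1))

# Location and scatter estimated from the cases with weight 1 only.
mu <- colMeans(X[w == 1, ])
S  <- cov(X[w == 1, ])

# Outlyingness of case 1: Mahalanobis distance with respect to (mu, S).
# stats::mahalanobis() returns the squared distance, hence the sqrt().
d_full <- sqrt(mahalanobis(X[1, ], center = mu, cov = S))

# Naive check: recompute the distance with each variable left out in turn.
d_drop <- sapply(seq_len(p), function(j) {
  sqrt(mahalanobis(X[1, -j], center = mu[-j], cov = S[-j, -j]))
})
names(d_drop) <- colnames(X)

# Cutoff from the 0.975 chi-squared quantile (assumed analogue of csqcritv).
cutoff_full <- sqrt(qchisq(0.975, df = p))

print(d_full > cutoff_full)      # is case 1 flagged when all variables are used?
print(names(which.min(d_drop)))  # variable whose removal reduces the distance most
```

Dropping variables one at a time only works when a single variable is responsible. SPADIMO, as described above, instead estimates a sparse direction of maximal outlyingness, so that several contributing variables can be identified together.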
Value
spadimo returns a list containing the following elements:
outlvars |
vector containing individual variable names contributing most to |
outlvarslist |
list of variables contributing to |
a |
vector, the sparse direction of maximal outlyingness. |
alist |
list of sparse directions of maximal outlyingness for different values of |
o.before |
outlyingness of original case (n > p) or PCA outlier flag (n <= p) before removing outlying variables. |
o.after |
outlyingness of reduced case (n > p) or PCA outlier flag (n <= p) after removing outlying variables. |
eta |
cutoff where |
time |
time to execute the SPADIMO algorithm. |
control |
a list with control parameters that are used. |
Author(s)
Michiel Debruyne, Sebastiaan Hoppner, Sven Serneels, and Tim Verdonck
References
Debruyne, M., Hoppner, S., Serneels, S., and Verdonck, T. (2019). Outlyingness: Which variables contribute most? Statistics and Computing, 29 (4), 707–723. DOI:10.1007/s11222-018-9831-5
See Also
crm, predict.crm, cellwiseheatmap, daprpr
Examples
library(crmReg)
data(topgear)
# get case weights from a robust estimator (covMcd function in robustbase package):
MCD <- robustbase::covMcd(topgear, alpha = 0.5)
# SPADIMO with diagnostic plots:
# Example 1:
Peugeot <- spadimo(data = topgear,
                   weights = MCD$mcd.wt,
                   obs = which(rownames(topgear) == "Peugeot 107"))
# check the plots!
# individual variable names contributing most to Peugeot 107's outlyingness:
print(Peugeot$outlvars)
# sparse direction of maximal outlyingness with eta = Peugeot$eta:
print(Peugeot$a)
# default SPADIMO control parameters:
print(Peugeot$control)
# Example 2:
Bugatti <- spadimo(data = topgear,
                   weights = MCD$mcd.wt,
                   obs = which(rownames(topgear) == "Bugatti Veyron"),
                   control = list(stopearly = TRUE, trace = TRUE, plot = TRUE))
# check the plots!
# individual variable names contributing most to Bugatti Veyron's outlyingness:
print(Bugatti$outlvars)
# sparse direction of maximal outlyingness with eta = Bugatti$eta:
print(Bugatti$a)