## SPArse DIrections of Maximal Outlyingness

### Description

Computes the sparse directions of maximal outlyingness of a given observation and shows diagnostic plots for analyzing that observation.

### Usage

spadimo(data, weights, obs,
control = list(scaleFun = Qn, nlatent = 1, etas = NULL, csqcritv  = 0.975,
stopearly = FALSE, trace = FALSE, plot = TRUE))

### Arguments

- `data`: the data as a data frame.
- `weights`: a numeric vector containing the case weights from a robust estimator.
- `obs`: the (integer) case number under consideration.
- `control`: a list of options that control details of the SPADIMO algorithm. The following options are available:
    - `scaleFun`: function used for robust scaling of the variables (e.g. `Qn`, `mad`, etc.).
    - `nlatent`: integer number of latent variables for sparse PLS regression (via SNIPLS) (default is 1).
    - `etas`: vector of decreasing sparsity parameters (default is `NULL`, in which case `etas = seq(0.9, 0.1, -0.05)` if n > p, otherwise `etas = seq(0.6, 0.1, -0.05)`).
    - `csqcritv`: probability level for the internal chi-squared quantile (used when n > p) (default is 0.975).
    - `stopearly`: if `TRUE`, the method stops as soon as the reduced case is no longer outlying; if `FALSE` (default), it loops through all values of eta.
    - `trace`: should intermediate results be printed (default is `FALSE`).
    - `plot`: should heatmaps and a graph of the results be shown (default is `TRUE`).
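As a rough illustration of the `etas` default described above, the following sketch rebuilds the documented grid by hand and writes out a full `control` list. The toy data frame `toy` and the name `ctrl` are made up for this example, and the code only mirrors the documented defaults; it is not taken from the package source.

```r
# Toy data standing in for a real data set (assumption for illustration).
set.seed(42)
toy <- as.data.frame(matrix(rnorm(50 * 5), nrow = 50, ncol = 5))
n <- nrow(toy)
p <- ncol(toy)

# Default sparsity grid as documented: a longer grid when n > p.
etas <- if (n > p) seq(0.9, 0.1, -0.05) else seq(0.6, 0.1, -0.05)

# A control list that spells out the documented defaults explicitly.
ctrl <- list(scaleFun = robustbase::Qn, nlatent = 1, etas = etas,
             csqcritv = 0.975, stopearly = FALSE, trace = FALSE, plot = TRUE)

print(etas)
```

A list like `ctrl` could then be passed as the `control` argument of `spadimo`; setting `stopearly = TRUE` would presumably shorten the run, since the loop over `etas` ends once the reduced case is no longer outlying.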

### Details

Given an observation that has been detected as an outlier, SPADIMO (Debruyne et al., 2019) finds the subset of variables contributing most to the outlier's outlyingness. Here, the outlyingness of a data point is defined as its robust Mahalanobis distance. The relevant variables are found by checking the direction in which the observation is most outlying. SPADIMO estimates this direction of maximal outlyingness in a sparse manner. In this way, the method helps to explain which variables make an observation an outlier.
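To make the notion of outlyingness concrete, here is a minimal sketch of a robust Mahalanobis distance computed from an MCD fit, before and after dropping the variables responsible for the outlyingness. It uses simulated data with a planted outlier (an assumption for illustration) and drops the planted variables by hand; it does not implement the sparse direction search that SPADIMO performs.

```r
library(robustbase)

set.seed(1)
n <- 100
p <- 4
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
X[1, ] <- c(0, 8, 0, -8)   # plant an outlier in V2 and V4 only

# Robust location and scatter from the MCD estimator.
mcd <- covMcd(X, alpha = 0.5)

# Outlyingness = squared robust Mahalanobis distance on all variables.
d2_full <- mahalanobis(X, center = mcd$center, cov = mcd$cov)
cut_full <- qchisq(0.975, df = p)

# Same distance after removing the two planted variables (V2, V4).
keep <- c(1, 3)
d2_red <- mahalanobis(X[, keep], center = mcd$center[keep],
                      cov = mcd$cov[keep, keep])
cut_red <- qchisq(0.975, df = length(keep))

# Case 1 exceeds the cutoff on all variables but not on the reduced set.
c(full = d2_full[1] > cut_full, reduced = d2_red[1] > cut_red)
```

The cutoff here is the 0.975 chi-squared quantile, matching the default `csqcritv`. In real data the contributing variables are unknown, which is the problem the sparse direction of maximal outlyingness is meant to solve.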

### Value

spadimo returns a list containing the following elements:

- `outlvars`: vector containing the individual variable names contributing most to `obs`'s outlyingness.
- `outlvarslist`: list of variables contributing to `obs`'s outlyingness for different values of eta.
- `a`: vector, the sparse direction of maximal outlyingness.
- `alist`: list of sparse directions of maximal outlyingness for different values of eta.
- `o.before`: outlyingness of the original case (n > p) or PCA outlier flag (n <= p) before removing outlying variables.
- `o.after`: outlyingness of the reduced case (n > p) or PCA outlier flag (n <= p) after removing outlying variables.
- `eta`: cutoff where `obs` is no longer outlying.
- `time`: time to execute the SPADIMO algorithm.
- `control`: a list with the control parameters that were used.

### Author(s)

Michiel Debruyne, Sebastiaan Hoppner, Sven Serneels, and Tim Verdonck

### References

Debruyne, M., Hoppner, S., Serneels, S., and Verdonck, T. (2019). Outlyingness: Which variables contribute most? Statistics and Computing, 29 (4), 707–723. DOI:10.1007/s11222-018-9831-5

### See Also

crm, predict.crm, cellwiseheatmap, daprpr

### Examples

library(crmReg)
data(topgear)

# get case weights from a robust estimator (covMcd function in robustbase package):
MCD <- robustbase::covMcd(topgear, alpha = 0.5)

# Example 1:
Peugeot <- spadimo(data = topgear, weights = MCD$mcd.wt,
                   obs = which(rownames(topgear) == "Peugeot 107"))
# check the plots!

# individual variable names contributing most to Peugeot 107's outlyingness:
print(Peugeot$outlvars)

# sparse direction of maximal outlyingness with eta = Peugeot$eta:
print(Peugeot$a)

print(Peugeot$control)

# Example 2:
Bugatti <- spadimo(data = topgear, weights = MCD$mcd.wt,
                   ...)  # the obs argument is cut off at this point

print(Bugatti$outlvars)

# sparse direction of maximal outlyingness with eta = Bugatti$eta: