MOE {ShapleyOutlier} | R Documentation |
Detecting cellwise outliers using Shapley values based on local outlyingness.
Description
The MOE function indicates outlying cells in a numeric data vector or data matrix x, for a given center mu and covariance matrix Sigma, using the Shapley value.
It is a more sophisticated alternative to the SCD algorithm: MOE uses the information of the regular cells to derive an alternative reference point (Mayrhofer and Filzmoser 2022).
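The Shapley value mentioned here can be illustrated with a small base R sketch. It assumes the decomposition described in the cited paper, where the contribution of coordinate j to the squared Mahalanobis distance is the j-th entry of the centered observation multiplied by the j-th entry of the inverse covariance applied to it. The helper name shapley_md is hypothetical and is not part of ShapleyOutlier; the package's own computation may differ in detail.

```r
# Hypothetical helper (not part of ShapleyOutlier): per-coordinate Shapley
# contributions to the squared Mahalanobis distance of x from mu.
shapley_md <- function(x, mu, Sigma) {
  d <- x - mu                      # centered observation
  as.numeric(d * solve(Sigma, d))  # elementwise product of d and Sigma^{-1} d
}

p <- 5
mu <- rep(0, p)
Sigma <- matrix(0.9, p, p); diag(Sigma) <- 1
x <- c(0, 1, 2, 2.3, 2.5)

phi <- shapley_md(x, mu, Sigma)

# stats::mahalanobis() returns the squared distance; the contributions
# should add up to it.
md2 <- mahalanobis(x, center = mu, cov = Sigma)
```

Because the contributions sum to the squared Mahalanobis distance, coordinates with large values of phi are the natural candidates for outlying cells.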
Usage
MOE(
x,
mu,
Sigma,
Sigma_inv = NULL,
step_size = 0.1,
min_deviation = 0,
max_step = NULL,
local = TRUE,
max_iter = 1000,
q = 0.99,
check_outlyingness = FALSE,
check = TRUE,
cells = NULL,
method = "cellMCD"
)
Arguments
x |
Data vector with |
mu |
Either |
Sigma |
Either |
Sigma_inv |
Either |
step_size |
Numeric. Step size for the imputation of outlying cells, with |
min_deviation |
Numeric. Detection threshold, with |
max_step |
Either |
local |
Logical. If TRUE (default), the non-central Chi-Squared distribution is used to determine the cutoff value based on |
max_iter |
Integer. The maximum number of iterations. |
q |
Numeric. The quantile of the Chi-squared distribution for detection and imputation of outliers. Defaults to 0.99. |
check_outlyingness |
Logical. Defaults to FALSE. If TRUE, the outlyingness is rechecked after applying |
check |
Logical. If |
cells |
Either |
method |
Either "cellMCD" (default) or "MCD". Specifies the method used for parameter estimation if |
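To give a feel for the q and local arguments, the sketch below compares a central and a non-central Chi-squared cutoff using base R only. The degrees of freedom (the number of variables p) and the non-centrality value are assumptions chosen for illustration; this page does not show how MOE derives the non-centrality parameter internally.

```r
p <- 5     # number of variables (assumed degrees of freedom)
q <- 0.99  # quantile, matching the default of the q argument

# Cutoff from the central Chi-squared distribution
cutoff_central <- qchisq(q, df = p)

# Cutoff from a non-central Chi-squared distribution; ncp = 2 is an
# arbitrary illustrative value, not one computed by MOE
cutoff_noncentral <- qchisq(q, df = p, ncp = 2)

c(central = cutoff_central, noncentral = cutoff_noncentral)
```

A positive non-centrality parameter shifts the distribution to the right, so the non-central cutoff is the larger of the two.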
Value
A list of class shapley_algorithm (new_shapley_algorithm) containing the following:
x |
A |
phi |
A |
mu_tilde |
A |
x_original |
A |
x_original |
The non-centrality parameters for the Chi-Squared distribution |
x_history |
A list with |
phi_history |
A list with |
mu_tilde_history |
A list with |
S_history |
A list with |
References
Mayrhofer M, Filzmoser P (2022). “Multivariate outlier explanations using Shapley values and Mahalanobis distances.” doi:10.48550/ARXIV.2210.10063.
Examples
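# Example 1: a single observation with p = 5 variables
library(ShapleyOutlier)
p <- 5
mu <- rep(0, p)
Sigma <- matrix(0.9, p, p); diag(Sigma) <- 1
# Inverse of Sigma; not used below, but it can be supplied through the
# Sigma_inv argument of MOE
Sigma_inv <- solve(Sigma)
x <- c(0, 1, 2, 2.3, 2.5)
MOE_x <- MOE(x = x, mu = mu, Sigma = Sigma)
plot(MOE_x)

# Example 2: a 100 x 10 data matrix with 100 contaminated cells
library(MASS)
set.seed(1)
n <- 100; p <- 10
mu <- rep(0, p)
Sigma <- matrix(0.9, p, p); diag(Sigma) <- 1
X <- mvrnorm(n, mu, Sigma)
# Replace 100 randomly chosen cells with the values -5 and 5
X[sample(1:(n * p), 100, FALSE)] <- rep(c(-5, 5), 50)
MOE_X <- MOE(X, mu, Sigma)
plot(MOE_X, subset = 20)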