fsmult {fsdaR} | R Documentation |
Gives an automatic outlier detection procedure in multivariate analysis
Description
Gives an automatic outlier detection procedure in multivariate analysis and performs forward search in multivariate analysis with exploratory data
Usage
fsmult(
x,
bsb,
monitoring = FALSE,
crit = c("md", "biv", "uni"),
rf = 0.95,
init,
plot = FALSE,
bonflev,
msg = TRUE,
nocheck = FALSE,
scaled = FALSE,
trace = FALSE,
...
)
Arguments
x |
An n x p data matrix (n observations and p variables). Rows of x represent observations, and columns represent variables. Missing values (NA's) and infinite values (Inf's) are allowed, since observations (rows) with missing or infinite values will automatically be excluded from the computations. |
bsb |
List of units forming the initial subset or size of the initial subset.
If Remark: if bsb is a vector, the option crit is ignored. |
monitoring |
Whether to perform monitoring of Mahalanobis distances and other specific quantities. |
crit |
If specified, the criterion to be used to initialize the search.
Remark: the starting point of the search does not affect the results of the analysis. Users can check this with their own datasets. Remark: if |
rf |
Confidence level for bivariate ellipses. The default is 0.95. This option is useful only if |
init |
Point at which to start monitoring the required diagnostics. Note that if a vector |
plot |
Plots the minimum Mahalanobis distance. If
|
bonflev |
Option that can be used to identify extreme outliers when the distribution of
the data is strongly non-normal. In these circumstances, the general signal detection rule
based on consecutive exceedances cannot be used. In this case
Default value is empty, which means to rely on general rules based on consecutive exceedances. |
msg |
Controls whether messages are displayed on the screen. If |
nocheck |
Controls whether checks are performed on the data matrix x. If |
scaled |
Controls whether to monitor scaled Mahalanobis distances (only if |
trace |
Whether to print intermediate results. Default is |
... |
potential further arguments passed to lower level functions. |
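The exclusion of rows with missing or infinite values described for x can be illustrated in base R. This is only a minimal sketch of the idea, not the code fsdaR runs internally; the helper name complete_finite_rows is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of fsdaR): keep only the rows in which
# every entry is finite, i.e. drop rows containing NA, NaN, Inf or -Inf.
complete_finite_rows <- function(x) {
  x <- as.matrix(x)
  keep <- apply(is.finite(x), 1, all)  # TRUE for rows with no NA/Inf
  x[keep, , drop = FALSE]
}

# Four observations on two variables; rows 2 and 3 contain Inf and NA.
x <- matrix(c(1, 2, NA, 4,
              5, Inf, 7, 8), ncol = 2)
x_clean <- complete_finite_rows(x)
dim(x_clean)
```

Filtering the data this way before calling fsmult() makes it explicit which observations enter the analysis.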
Value
Depending on the input parameter monitoring, one of the following objects will be returned:
Author(s)
FSDA team, valentin.todorov@chello.at
References
Riani M., Atkinson A.C., Cerioli A. (2009). Finding an unknown number of multivariate outliers. Journal of the Royal Statistical Society Series B, Vol. 71, pp. 201-221.
Cerioli A., Farcomeni A., Riani M. (2014). Strong consistency and robustness of the Forward Search estimator of multivariate location and scatter. Journal of Multivariate Analysis, Vol. 126, pp. 167-183, http://dx.doi.org/10.1016/j.jmva.2013.12.010.
Atkinson A.C., Riani M., Cerioli A. (2004). Exploring multivariate data with the forward search. Springer Verlag, New York.
Examples
## Not run:
data(hbk, package="robustbase")
(out <- fsmult(hbk[,1:3]))
class(out)
summary(out)
## Generate contaminated data (200,3)
n <- 200
p <- 3
set.seed(123456)
X <- matrix(rnorm(n*p), nrow=n)
Xcont <- X
Xcont[1:5, ] <- Xcont[1:5,] + 3
out1 <- fsmult(Xcont, trace=TRUE) # no plots (plot defaults to FALSE)
names(out1)
(out1 <- fsmult(Xcont, trace=TRUE, plot=TRUE)) # identical to plot=1
## plot=1 - minimum MD with envelopes based on n observations
## and the scatterplot matrix with the outliers highlighted
(out1 <- fsmult(Xcont, trace=TRUE, plot=1))
## plot=2 - additional plots of envelope resuperimposition
(out1 <- fsmult(Xcont, trace=TRUE, plot=2))
## plot is a list: plots showing envelope superimposition in normal coordinates.
(out1 <- fsmult(Xcont, trace=TRUE, plot=list(ncoord=1)))
## Choosing an initial subset formed by the five observations with
## the smallest Mahalanobis Distance.
(out1 <- fsmult(Xcont, m0=5, crit="md", trace=TRUE))
## fsmult() with monitoring
(out2 <- fsmult(Xcont, monitoring=TRUE, trace=TRUE))
names(out2)
## Monitor the exceedances from m=200 without showing plots.
n <- 1000
p <- 10
Y <- matrix(rnorm(n*p), ncol=p)
(out <- fsmult(Y, init=200))
## Forged Swiss banknotes examples.
data(swissbanknotes)
## Monitor the exceedances of Minimum Mahalanobis Distance
(out1 <- fsmult(swissbanknotes[101:200,], plot=1))
## Control minimum and maximum on the x axis
(out1 <- fsmult(swissbanknotes[101:200,], plot=list(xlim=c(60,90))))
## Monitor the exceedances of Minimum Mahalanobis Distance using
## normal coordinates for mmd.
(out1 <- fsmult(swissbanknotes[101:200,], plot=list(ncoord=1)))
## End(Not run)
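The minimum Mahalanobis distance monitored in the examples above can be sketched in base R for a single subset. This is an illustrative assumption about the quantity involved, following the description in Riani, Atkinson and Cerioli (2009), and not the fsdaR implementation; the function name min_md_outside and the choice of subset are hypothetical.

```r
# Hypothetical helper (not part of fsdaR): minimum Mahalanobis distance
# among the units that do NOT belong to a given subset, using location
# and scatter estimated from the subset only.
min_md_outside <- function(X, subset) {
  centre  <- colMeans(X[subset, , drop = FALSE])
  scatter <- cov(X[subset, , drop = FALSE])
  d2 <- mahalanobis(X, centre, scatter)        # squared distances, all units
  outside <- setdiff(seq_len(nrow(X)), subset)
  sqrt(min(d2[outside]))
}

# Same contaminated data as in the examples: first five rows shifted by 3.
set.seed(123456)
X <- matrix(rnorm(200 * 3), nrow = 200)
X[1:5, ] <- X[1:5, ] + 3

# Arbitrary subset of 100 uncontaminated units, chosen for illustration.
min_md_outside(X, subset = 6:105)
```

In a forward search this quantity is recomputed as the subset grows, and values that are large relative to the reference envelopes point to outliers.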