detectOutliers {MALDIrppa} | R Documentation |
Detection of outlying mass peak profiles
Description
This function identifies outlying cases in a collection of processed mass peak profiles. It can be applied to either peak intensities or binary data (peak presence/absence patterns). A grouping factor can be specified so that the procedure is run at the desired level of aggregation.
Usage
detectOutliers(x, by = NULL, binary = FALSE, ...)
Arguments
x |
A list of |
by |
If given, a grouping variable ( |
binary |
Logical value. It indicates whether the procedure must be applied on either peak intensities ( |
... |
Optional arguments for the robust outlier detection method. |
Details
This function marks samples whose mass peak profiles deviate markedly from those of other samples at the given aggregation level. It applies robust methods for multivariate outlier detection to metric multidimensional scaling (MDS) coordinates (Euclidean distance is used for peak intensities and binary distance for binary profiles; see dist). The number of MDS coordinates used is generally set to p = floor(n/2), where n is the number of samples in the target subset. This is an upper cap recommended for the computation of the robust MCD estimator by covMcd. However, that rule of thumb can still lead to matrix singularity problems with covMcd in some cases. When this occurs, detectOutliers reduces p further, using the largest number of MDS coordinates that gives a non-singular covariance matrix (p is never reduced below 2). The adaptive multivariate outlier detection algorithm was adapted from the mvoutlier package.
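The procedure described in Details can be illustrated with a rough, self-contained sketch. It is not the MALDIrppa implementation. The helper name sketch_outliers, the simulated matrix m, and the fixed chi-square cutoff (alpha = 0.975) are assumptions made for illustration, and MASS::cov.mcd is used as a stand-in for robustbase::covMcd. The adaptive cutoff adapted from mvoutlier is not reproduced here.

```r
library(MASS)  # provides cov.mcd(), a stand-in here for robustbase::covMcd()

# Hypothetical helper: flags rows of a samples-by-peaks matrix `m` as outliers.
sketch_outliers <- function(m, binary = FALSE, alpha = 0.975) {
  n <- nrow(m)
  # Euclidean distance for intensities, binary distance for presence/absence
  d <- dist(m, method = if (binary) "binary" else "euclidean")
  # Upper cap on the number of MDS coordinates: p = floor(n / 2)
  p <- floor(n / 2)
  coords <- cmdscale(d, k = p)
  p <- ncol(coords)  # cmdscale() may return fewer coordinates than requested
  # Reduce p until the robust covariance can be estimated (never below 2)
  repeat {
    md2 <- tryCatch({
      xp <- coords[, seq_len(p), drop = FALSE]
      fit <- cov.mcd(xp)
      mahalanobis(xp, fit$center, fit$cov)  # squared robust distances
    }, error = function(e) NULL)
    if (!is.null(md2) || p <= 2) break
    p <- p - 1
  }
  if (is.null(md2)) stop("robust covariance could not be estimated")
  # Fixed chi-square cutoff (an assumption; not the adaptive rule)
  unname(md2 > qchisq(alpha, df = p))
}

set.seed(1)
m <- matrix(rnorm(30 * 50), nrow = 30)  # 30 simulated samples, 50 peaks
m[30, ] <- m[30, ] + 10                 # make the last sample clearly atypical
flags <- sketch_outliers(m)
which(flags)                            # indices of flagged samples
```

With so many coordinates relative to the number of samples, a fixed cutoff may also flag some regular samples, which is one reason an adaptive rule can be preferable.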
Value
If by = NULL, a logical vector of length equal to the number of elements of x, with TRUE indicating outlying samples. Otherwise, a 2-column data.frame is generated which includes such a logical vector along with the grouping variable given in by.
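As a sketch of how the returned value might be used to discard flagged samples, the snippet below works on a mock object, because the column names of the returned data.frame are not stated here. The names group and outlier, and the objects out and peaks_mock, are hypothetical. The logical column is located by type rather than by name or position.

```r
# Mock of a 2-column result; column names and values are hypothetical
out <- data.frame(group   = c("A", "A", "B", "B"),
                  outlier = c(FALSE, TRUE, FALSE, FALSE))

# Stand-in for the list of peak profiles passed as x
peaks_mock <- list("p1", "p2", "p3", "p4")

# Locate the logical column by type, then keep the samples not flagged
is_flag <- vapply(out, is.logical, logical(1))
keep <- !out[[which(is_flag)[1]]]
peaks_clean <- peaks_mock[keep]
length(peaks_clean)
```

When by = NULL the result is already a logical vector, so peaks_mock[!out] would be enough.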
Examples
# Load example data
data(spectra) # list of MassSpectra class objects
data(type) # metadata
# Some pre-processing
sc.results <- screenSpectra(spectra, meta = type)
spectra <- sc.results$fspectra # filtered mass spectra
type <- sc.results$fmeta # filtered metadata
spectra <- transformIntensity(spectra, method = "sqrt")
spectra <- wavSmoothing(spectra)
spectra <- removeBaseline(spectra)
peaks <- detectPeaks(spectra)
peaks <- alignPeaks(peaks, minFreq = 0.8)
# Find outlying samples at isolate level
out <- detectOutliers(peaks, by = type$isolate)
# From peak presence/absence patterns
out.binary <- detectOutliers(peaks, by = type$isolate, binary = TRUE)