filter_spectra {waves}    R Documentation
Filter spectral data frame based on Mahalanobis distance
Description
Determines the Mahalanobis distance of each observation (row) in a
data.frame of spectral data, with the option to filter out
observations based on these distances.
Usage
filter_spectra(df, filter, return.distances, num.col.before.spectra,
window.size, verbose)
Arguments
df |
|
filter |
boolean that determines whether or not the input
|
return.distances |
boolean that determines whether a column of squared
Mahalanobis distances will be included in output |
num.col.before.spectra |
number of columns to the left of the spectral
matrix in |
window.size |
number defining the size of window to use when calculating the covariance of the spectra (required to calculate Mahalanobis distance). Default is 10. |
verbose |
If |
Details
This function uses a chi-square distribution with a 95% cutoff, where
degrees of freedom = number of wavelengths (columns) in the input
data.frame.
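The cutoff rule described above can be sketched in base R. This is a minimal illustration and not the implementation used by waves: it computes a single covariance matrix over all columns and ignores window.size, and the toy matrix spectra, the planted outlier, and the names d2, cutoff, and keep are all assumptions made for the example.

```r
# Illustrative sketch only; not the waves implementation.
set.seed(1)

# Hypothetical toy data: 200 observations, 5 "wavelength" columns
spectra <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
spectra[1, ] <- spectra[1, ] + 10  # plant one obvious outlier in row 1

# Squared Mahalanobis distance of each row from the column means
d2 <- mahalanobis(spectra, center = colMeans(spectra), cov = cov(spectra))

# 95% chi-square cutoff, degrees of freedom = number of columns
cutoff <- qchisq(0.95, df = ncol(spectra))

# Keep only rows whose squared distance does not exceed the cutoff
keep <- d2 <= cutoff
filtered <- spectra[keep, , drop = FALSE]
message(sum(!keep), " of ", nrow(spectra), " rows exceed the cutoff")
```

Note that stats::mahalanobis() returns squared distances, which is the quantity the chi-square cutoff is compared against.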
Value
If filter is TRUE, returns the filtered data frame df and reports the
number of rows removed. The Mahalanobis distance with a cutoff of 95%
of the chi-square distribution (degrees of freedom = number of
wavelengths) is used as the filtering criterion. If filter is FALSE,
returns the full input df with a column h.distances containing the
Mahalanobis distance for each row.
Author(s)
Jenna Hershberger jmh579@cornell.edu
References
Johnson, R.A., and D.W. Wichern. 2007. Applied Multivariate Statistical Analysis (6th Edition). p. 189.
Examples
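library(magrittr)
library(waves) # provides filter_spectra() and the ikeogu.2017 example data

# Drop the TCC column and incomplete rows, then filter out spectral outliers
filtered.test <- ikeogu.2017 %>%
  dplyr::select(-TCC) %>%
  na.omit() %>%
  filter_spectra(
    df = .,
    filter = TRUE,
    return.distances = TRUE,
    num.col.before.spectra = 5,
    window.size = 15
  )

# Preview the first five rows: the first five columns and the last six columns
filtered.test[1:5, c(1:5, (ncol(filtered.test) - 5):ncol(filtered.test))]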