maha_dist {assertr} | R Documentation
Computes Mahalanobis distance for each row of a data frame
Description
This function returns a vector with one element per row of the provided data frame, giving the Mahalanobis distance of each row from the data set as a whole.
Usage
maha_dist(data, keep.NA = TRUE, robust = FALSE, stringsAsFactors = FALSE)
Arguments
data
A data frame.
keep.NA
Ensure that every row with missing data remains NA in the output? TRUE by default.
robust
Attempt to compute Mahalanobis distance based on a robust covariance matrix? FALSE by default.
stringsAsFactors
Convert non-factor string columns into factors? FALSE by default.
Details
This is useful for finding anomalous observations, row-wise.
It will convert any categorical variables in the data frame into numerics
as long as they are factors. For example, in order for a character
column to be used as a component in the distance calculations, it must
either be a factor already or be converted to a factor by using the
stringsAsFactors parameter.
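To illustrate the factor requirement, the following sketch uses only base R to show the integer codes that a factor column carries. It assumes, without confirmation from this page, that these codes are what enters the distance calculation. The data frame `df` and its columns are hypothetical.

```r
# Hypothetical data: one numeric column and one character column
df <- data.frame(
  weight = c(2.1, 3.4, 2.9, 5.0),
  colour = c("red", "blue", "red", "green"),
  stringsAsFactors = FALSE
)

# A character column is not a factor, so it has no numeric coding yet
is.factor(df$colour)

# Converting to a factor assigns each level an integer code
# (levels are sorted alphabetically: blue = 1, green = 2, red = 3)
colour_codes <- as.numeric(factor(df$colour))
colour_codes

# With assertr installed, stringsAsFactors = TRUE asks maha_dist to do
# the character-to-factor conversion itself
if (requireNamespace("assertr", quietly = TRUE)) {
  assertr::maha_dist(df, stringsAsFactors = TRUE)
}
```

Note that the integer codes impose an ordering on the levels, so an unordered categorical column contributes to the distance as if it were ordinal.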
Value
A vector of observation-wise Mahalanobis distances.
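As a rough illustration of what one distance per row looks like, the sketch below uses base R's `stats::mahalanobis` on `mtcars`. This is not `maha_dist` itself: `stats::mahalanobis` returns squared distances, and this page does not state whether `maha_dist` produces numerically identical values.

```r
# Base R sketch: squared Mahalanobis distance of each row of mtcars
# from the column means, using the sample covariance matrix
center <- colMeans(mtcars)
covariance <- cov(mtcars)
d2 <- mahalanobis(mtcars, center, covariance)

# One value per row, in row order
length(d2) == nrow(mtcars)

# Rows with the largest distances are the most unusual overall
head(sort(d2, decreasing = TRUE), 3)
```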
See Also
Examples
library(assertr)
maha_dist(mtcars)
maha_dist(iris, robust=TRUE)
library(magrittr) # for piping operator
library(dplyr) # for "everything()" function
# using every column from mtcars, compute mahalanobis distance
# for each observation, and ensure that each distance is within 10
# median absolute deviations from the median
mtcars %>%
  insist_rows(maha_dist, within_n_mads(10), everything())
  ## anything here will run