find_mad {rempsyc} | R Documentation |
Identify outliers based on 3 MAD
Description
Identify outliers based on 3 median absolute deviations (MAD) from the median.
Usage
find_mad(data, col.list, ID = NULL, criteria = 3, mad.scores = TRUE)
Arguments
data
The data frame.
col.list
List of variables to check for outliers.
ID
ID variable, if you would like the outliers to be identified by it.
criteria
How many MADs to use as the threshold (similar to standard deviations).
mad.scores
Logical, whether to output robust z (MAD) scores (default)
or raw scores. Defaults to TRUE.
Details
The function internally uses scale_mad()
to "standardize" the data
based on the MAD and median, and then checks for any observation whose
standardized score exceeds the specified criteria in absolute value (e.g., +/-3).
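The standardization described above can be sketched in base R. This is a minimal illustration, not the package's actual implementation: robust_z() is a hypothetical helper, the toy vector x is made up for the example, and it assumes the MAD is computed with stats::mad() and its default scaling constant (1.4826).

```r
# Hypothetical helper (not part of rempsyc): robust z-scores computed as
# (x - median) / MAD. stats::mad() uses the scaling constant 1.4826 by
# default; whether scale_mad() does exactly this is an assumption.
robust_z <- function(x) {
  (x - stats::median(x, na.rm = TRUE)) / stats::mad(x, na.rm = TRUE)
}

# Toy data, invented for illustration: one value far from the rest
x <- c(10, 12, 11, 13, 12, 11, 50)

# Standardize, then flag observations beyond +/- 3 MAD from the median
z <- robust_z(x)
flagged <- which(abs(z) > 3)

x[flagged]
```

Because the median and MAD are barely affected by the extreme value itself, the extreme value stands out clearly, which is the rationale given in the Leys et al. (2013) reference.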
For the easystats equivalent, use:
performance::check_outliers(x, method = "zscore_robust", threshold = 3).
Value
A list of data frames of outliers per variable, with row
numbers, based on the MAD. When printed, provides the number
of outliers, the selected variables, and any outlier flagged for
more than one variable. More information can be obtained
by calling the attributes()
function on the generated object.
References
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49(4), 764–766. https://doi.org/10.1016/j.jesp.2013.03.013
Examples
# Check every column of mtcars; outliers are reported by row number
find_mad(
data = mtcars,
col.list = names(mtcars),
criteria = 3
)
# Add an ID column so that outliers are identified by car name
mtcars2 <- mtcars
mtcars2$car <- row.names(mtcars)
find_mad(
data = mtcars2,
col.list = names(mtcars),
ID = "car",
criteria = 3
)