find_mad {rempsyc}    R Documentation

Identify outliers based on 3 MAD

Description

Identify outliers based on 3 median absolute deviations (MAD) from the median.

Usage

find_mad(data, col.list, ID = NULL, criteria = 3, mad.scores = TRUE)

Arguments

data

The data frame.

col.list

List of variables to check for outliers.

ID

ID variable if you would like the outliers to be identified as such.

criteria

Number of MADs to use as the threshold (similar to standard deviations). Defaults to 3.

mad.scores

Logical; whether to output robust z (MAD) scores (TRUE, the default) or raw scores (FALSE).

Details

The function internally uses scale_mad() to "standardize" the data based on the median and MAD, and then checks for any observation whose score exceeds the specified criteria in either direction (e.g., +/-3).
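The standardize-then-threshold idea can be sketched in base R as follows. This is only an illustration, not the package's implementation: robust_z() is a hypothetical helper, and it assumes the default scaling constant of stats::mad() (1.4826), which may or may not match what scale_mad() does internally.

```r
# Hypothetical helper (not part of rempsyc): robust z-scores from median and MAD.
robust_z <- function(x) {
  # stats::mad() multiplies the raw MAD by 1.4826 by default, which makes it
  # comparable to a standard deviation for normally distributed data.
  (x - stats::median(x, na.rm = TRUE)) / stats::mad(x, na.rm = TRUE)
}

x <- c(10, 11, 12, 13, 14, 100)
z <- robust_z(x)
criteria <- 3

# Positions whose absolute robust z-score exceeds the threshold
outlier_rows <- which(abs(z) > criteria)
outlier_rows
```

Here only the sixth value (100) lies more than 3 MADs from the median; the remaining values all have absolute scores below 3.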

For the easystats equivalent, use: performance::check_outliers(x, method = "zscore_robust", threshold = 3).

Value

A list of data frames of outliers per variable, with row numbers, based on the MAD. When printed, it provides the number of outliers, the selected variables, and any outlier flagged for more than one variable. More information can be obtained by calling attributes() on the generated object.
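To make the idea of "one data frame per variable" and "flagged for more than one variable" concrete, here is a rough base R sketch on made-up data. The column names (Row, id, score), the helper robust_z(), and the toy data frame are all assumptions for illustration; the actual structure returned by find_mad() may differ.

```r
# Toy data: row "f" is extreme on both x and y
df <- data.frame(
  id = c("a", "b", "c", "d", "e", "f"),
  x  = c(10, 11, 12, 13, 14, 100),
  y  = c(1, 2, 3, 4, 5, 60)
)

# Hypothetical helper: robust z-scores based on the median and MAD
robust_z <- function(x) (x - stats::median(x)) / stats::mad(x)

cols <- c("x", "y")

# One data frame of flagged rows per variable
outliers <- lapply(cols, function(v) {
  z <- robust_z(df[[v]])
  keep <- which(abs(z) > 3)
  data.frame(Row = keep, id = df$id[keep], score = z[keep])
})
names(outliers) <- cols

# Rows flagged for more than one variable
all_rows <- unlist(lapply(outliers, function(d) d$Row), use.names = FALSE)
repeated <- unique(all_rows[duplicated(all_rows)])
repeated
```

In this toy example, row 6 (id "f") is the only row flagged, and it is flagged for both variables.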

References

Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median. Journal of Experimental Social Psychology, 49(4), 764–766. https://doi.org/10.1016/j.jesp.2013.03.013

Examples

# Check every column of mtcars; outliers are reported by row number
find_mad(
  data = mtcars,
  col.list = names(mtcars),
  criteria = 3
)

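# Add an ID column so that outliers are also identified by car name
mtcars2 <- mtcars
mtcars2$car <- row.names(mtcars)
find_mad(
  data = mtcars2,
  col.list = names(mtcars), # original numeric columns only, excluding "car"
  ID = "car",
  criteria = 3
)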

[Package rempsyc version 0.1.8 Index]