within_n_sds {assertr} | R Documentation |
Return a function to create z-score checking predicate
Description
This function takes one required argument, the number of standard deviations within which to accept a particular data point.
Usage
within_n_sds(n, ...)
Arguments
n |
The number of standard deviations from the mean within which to accept a datum |
... |
Additional arguments to be passed to within_bounds |
Details
As an example, if '2' is passed into this function, it will return
a function that takes a vector and computes the bounds lying two
standard deviations either side of the mean. That function will then
return a within_bounds function that can be applied
to a single datum. If the datum is within two standard deviations of
the mean of the vector given to the function returned by this function,
it will return TRUE. If not, it will return FALSE.
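The three layers described above (a count of standard deviations, then a vector, then a single datum) can be sketched in base R. This is only an illustration under stated assumptions, not assertr's actual source: the name sketch_within_n_sds is hypothetical, and the choices to ignore NAs when computing the mean and standard deviation and to treat the bounds as inclusive are assumptions.

```r
# Hypothetical stand-in for within_n_sds(); NOT assertr's implementation.
sketch_within_n_sds <- function(n, allow.na = TRUE) {
  # Layer 1: remember n, return a function that takes a vector
  function(a_vector) {
    # Assumption: NAs are ignored when computing the bounds
    mu    <- mean(a_vector, na.rm = TRUE)
    stdev <- sd(a_vector, na.rm = TRUE)
    lower <- mu - n * stdev
    upper <- mu + n * stdev
    # Layer 2: return a predicate for a single datum
    function(x) {
      if (is.na(x)) return(allow.na)
      # Assumption: bounds are inclusive
      x >= lower && x <= upper
    }
  }
}

v <- c(80, 90, 100, 110, 120)   # mean 100, sd about 15.8
check <- sketch_within_n_sds(1)(v)
check(105)       # TRUE
check(40)        # FALSE
check(NA_real_)  # TRUE, because allow.na defaults to TRUE here
```

Each call fixes one more piece of information, which is why the result can be handed to another function before the data vector is known.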
This function isn't meant to be used on its own, although it can be. Rather,
it is meant to be used with the insist function to
search for potentially erroneous data points in a data set.
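To show what searching for potentially erroneous data points amounts to, here is a base R sketch that flags the elements of a vector lying more than n standard deviations from its mean. It does not use insist and makes no claim about how insist reports failures. The data and variable names are made up for illustration.

```r
# Made-up data: nine ordinary readings and one suspicious value
x <- c(10, 11, 9, 10, 12, 10, 11, 9, 10, 50)
n <- 2

mu <- mean(x)   # 14.2
s  <- sd(x)     # about 12.6

# TRUE where a value lies more than n standard deviations from the mean
outside <- abs(x - mu) > n * s

which(outside)  # 10
x[outside]      # 50
```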
Value
A function that takes a vector and returns a within_bounds
predicate based on the standard deviation of that vector.
See Also
Examples
test.vector <- rnorm(100, mean=100, sd=20)
within.one.sd <- within_n_sds(1)
custom.bounds.checker <- within.one.sd(test.vector)
custom.bounds.checker(105) # returns TRUE
custom.bounds.checker(40) # returns FALSE
# same as
within_n_sds(1)(test.vector)(40) # returns FALSE
within_n_sds(2)(test.vector)(as.numeric(NA)) # returns TRUE
# because, by default, within_bounds() will accept
# NA values. If we want to reject NAs, we have to
# provide extra arguments to this function
within_n_sds(2, allow.na=FALSE)(test.vector)(as.numeric(NA)) # returns FALSE
# or in a pipeline, like this was meant for
library(magrittr)
iris %>%
  insist(within_n_sds(5), Sepal.Length)