winsors {quest} | R Documentation |
Winsorize Numeric Data
Description
winsors
winsorizes numeric data by recoding extreme values as a user-identified
boundary value, which is defined in z-score units. The to.na
argument provides the option of recoding the extreme values as missing (NA) instead.
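The idea described above can be sketched in base R for a single numeric vector. This is only an illustration under stated assumptions, not the quest implementation: winsor_sketch is a hypothetical helper, and it assumes the z-score boundaries are converted to raw scores with the sample mean and standard deviation.

```r
# Hypothetical helper, not part of quest: winsorize one numeric vector
# at boundaries given in z-score units.
# Assumption: boundaries are mean + z * sd, using the sample mean and sd.
winsor_sketch <- function(x, z.min = -3, z.max = 3, to.na = FALSE) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  lower <- m + z.min * s  # raw-score value of the lower boundary
  upper <- m + z.max * s  # raw-score value of the upper boundary
  low <- !is.na(x) & x < lower
  high <- !is.na(x) & x > upper
  # recode extreme values to the boundary, or to NA when to.na = TRUE
  x[low] <- if (to.na) NA else lower
  x[high] <- if (to.na) NA else upper
  x
}

x <- c(rep(0, 20), 100)        # one clearly extreme value
winsor_sketch(x)               # the extreme value is pulled in to the upper boundary
winsor_sketch(x, to.na = TRUE) # the extreme value becomes NA instead
```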
Usage
winsors(
data,
vrb.nm,
z.min = -3,
z.max = 3,
rtn.int = FALSE,
to.na = FALSE,
suffix = "_win"
)
Arguments
data |
data.frame of data. |
vrb.nm |
character vector of colnames from data specifying the variables to winsorize. |
z.min |
numeric vector of length 1 specifying the lower boundary value in z-score units. |
z.max |
numeric vector of length 1 specifying the upper boundary value in z-score units. |
rtn.int |
logical vector of length 1 specifying whether the recoded values should be rounded to the nearest integer. This can be useful when working with count data and decimal values are impossible. |
to.na |
logical vector of length 1 specifying whether the extreme values should be recoded to NA rather than winsorized to the boundary values. |
suffix |
character vector of length 1 specifying the string to append to the end of the colnames in the return object. |
Value
data.frame of winsorized data with extreme values recoded as either
the boundary values or NA and colnames = paste0(vrb.nm, suffix).
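Because the returned colnames carry the suffix, the result can sit next to the original columns without name clashes. A small sketch of that naming rule follows; new.nm and both are illustrative names only, and the quest call is skipped when the package is not installed.

```r
vrb.nm <- names(cars)
new.nm <- paste0(vrb.nm, "_win")  # naming rule with the default suffix
new.nm

# Only runs when quest is installed.
if (requireNamespace("quest", quietly = TRUE)) {
  new <- quest::winsors(data = cars, vrb.nm = vrb.nm, z.min = -2, z.max = 2)
  names(new)               # documented to equal paste0(vrb.nm, suffix)
  both <- cbind(cars, new) # original and winsorized columns side by side
  head(both)
}
```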
Examples
# winsorize
lapply(X = quakes[c("mag","stations")], FUN = table)
new <- winsors(quakes, vrb.nm = names(quakes))
lapply(X = new, FUN = table)
# recode as NA
vecNA(quakes)
new <- winsors(quakes, vrb.nm = names(quakes), to.na = TRUE)
vecNA(new)
# compare rtn.int = FALSE with rtn.int = TRUE
winsors(data = cars, vrb.nm = names(cars), z.min = -2, z.max = 2, rtn.int = FALSE)
winsors(data = cars, vrb.nm = names(cars), z.min = -2, z.max = 2, rtn.int = TRUE)
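To see what the rtn.int comparison is getting at, the boundary values for cars$dist can be worked out by hand in base R. This is a rough sketch that assumes the boundaries are mean +/- 2 * sd and that rtn.int = TRUE amounts to rounding the recoded (boundary) values with round(); the actual winsors internals may differ.

```r
x <- cars$dist
upper <- mean(x) + 2 * sd(x)  # assumed upper boundary in raw-score units
lower <- mean(x) - 2 * sd(x)  # assumed lower boundary in raw-score units

# values clamped to the decimal boundaries (analogous to rtn.int = FALSE)
win.dec <- pmin(pmax(x, lower), upper)
# values clamped to integer-rounded boundaries (analogous to rtn.int = TRUE)
win.int <- pmin(pmax(x, round(lower)), round(upper))

c(decimal = upper, integer = round(upper))
```

With count-like data such as stopping distances recorded in whole units, the rounded version keeps every value a whole number.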