mutate_other {hutils} | R Documentation |
Group infrequent entries into 'Other category'
Description
Useful when you want to constrain the number of unique values in a column by keeping only the most common values.
Usage
mutate_other(
.data,
var,
n = 5,
count,
by = NULL,
var.weight = NULL,
mass = NULL,
copy = TRUE,
other.category = "Other"
)
Arguments
.data |
Data containing variable. |
var |
Variable containing infrequent entries, to be collapsed into "Other". |
n |
Threshold for total number of categories above "Other". |
count |
Threshold for total count of observations before "Other". |
by |
Extra variables to group by when calculating n. |
var.weight |
Variable to act as a weight: values of var whose summed var.weight exceeds mass are kept; the rest are set to other.category. |
mass |
Threshold for sum of var.weight, aggregated within each value of var. |
copy |
Should .data be copied? Defaults to TRUE. |
other.category |
Value that infrequent entries are to be collapsed into. Defaults to "Other". |
Value
.data, but with var changed so that infrequent values all take the same value (other.category).
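To make the idea of keeping only the most common values concrete, here is a minimal base R sketch of collapsing by frequency. It is an illustration only, not the hutils implementation: collapse_top_n is a hypothetical helper, and how mutate_other itself breaks ties at the cutoff is not stated on this page, so the data below avoids ties.

```r
# Hypothetical helper (not part of hutils): keep the n most frequent values
# of x and replace every other value with `other`.
collapse_top_n <- function(x, n, other = "Other") {
  # Frequency of each distinct value, most frequent first
  counts <- sort(table(x), decreasing = TRUE)
  # Names of the values to keep (at most n of them)
  keep <- names(counts)[seq_len(min(n, length(counts)))]
  ifelse(x %in% keep, x, other)
}

x <- c("A", "A", "A", "B", "B", "C", "D")

# A occurs 3 times and B twice, so with n = 2 only C and D are collapsed
collapsed <- collapse_top_n(x, n = 2)
print(collapsed)
```

The result has the same length and order as the input, which mirrors the return value described above: the column is modified in place of its rare values rather than rows being dropped.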
Examples
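library(data.table)
library(magrittr)
library(hutils)

# One row per observation; 'value' is used as the weight
DT <- data.table(City = c("A", "A", "B", "B", "C", "D"),
                 value = c(1, 9, 4, 4, 5, 11))

# Collapse cities by their total 'value' relative to mass = 10
DT %>%
  mutate_other("City", var.weight = "value", mass = 10) %>%
  .[]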
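The weighting rule used in the example above can be sketched in base R as follows. This is a rough illustration under stated assumptions rather than the package's code: collapse_by_mass is a hypothetical helper, and the strict ">" comparison is an assumption, since this page does not say how a total exactly equal to mass is treated. The sketch uses mass = 9 instead of 10 so that no group sits on the boundary.

```r
# Hypothetical helper (not part of hutils): keep values of x whose summed
# weight exceeds `mass`; replace the rest with `other`.
# Assumption: the comparison is strict (">").
collapse_by_mass <- function(x, w, mass, other = "Other") {
  # Summed weight for each distinct value of x
  totals <- tapply(w, x, sum)
  # Values whose total weight exceeds the threshold
  keep <- names(totals)[totals > mass]
  ifelse(x %in% keep, x, other)
}

city  <- c("A", "A", "B", "B", "C", "D")
value <- c(1, 9, 4, 4, 5, 11)

# Totals are A = 10, B = 8, C = 5, D = 11, so with mass = 9 only A and D stay
result <- collapse_by_mass(city, value, mass = 9)
print(result)
```

Note that the weight is summed per city before the comparison, so a city with one small row and one large row (such as A) is judged on its total rather than on any single row.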