R: Recode values

recode {dplyr}

R Documentation

Recode values

Description

recode() is superseded in favor of case_match(), which handles the most important cases of recode() with a more elegant interface. recode_factor() is also superseded; its direct replacement is not currently available, but will eventually live in forcats. For creating new variables based on logical vectors, use if_else(). For even more complicated criteria, use case_when().

recode() is a vectorised version of switch(): you can replace numeric values based on their position or their name, and character or factor values only by their name. This is an S3 generic: dplyr provides methods for numeric vectors, character vectors, and factors. You can use recode() directly with factors; it will preserve the existing order of levels while changing the values. Alternatively, you can use recode_factor(), which will change the order of levels to match the order of replacements.
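
The level-ordering difference can be sketched as follows. This is a minimal illustration that assumes dplyr is attached; `f`, `kept`, and `reordered` are made-up names for the example, not part of the documented interface.

```r
library(dplyr)

# Hypothetical factor with levels in alphabetical order: a, b, c
f <- factor(c("a", "b", "c"))

# recode() renames the matched levels in place, so the original
# level order is preserved regardless of the argument order
kept <- recode(f, c = "Cherry", a = "Apple")
levels(kept)

# recode_factor() puts the replaced levels first, in the order the
# replacements were given; untouched levels follow
reordered <- recode_factor(f, c = "Cherry", a = "Apple")
levels(reordered)
```

Both calls change the same values; only the order of the resulting levels differs, which matters for anything that depends on level order, such as plotting or sorting.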

Usage

recode(.x, ..., .default = NULL, .missing = NULL)

recode_factor(.x, ..., .default = NULL, .missing = NULL, .ordered = FALSE)

Arguments

`.x`	A vector to modify
`...`	<`dynamic-dots`> Replacements. For character and factor `.x`, these should be named and replacement is based only on their name. For numeric `.x`, these can be named or not. If not named, the replacement is done based on position i.e. `.x` represents positions to look for in replacements. See examples. When named, the argument names should be the current values to be replaced, and the argument values should be the new (replacement) values. All replacements must be the same type, and must have either length one or the same length as `.x`.
`.default`	If supplied, all values not otherwise matched will be given this value. If not supplied and if the replacements are the same type as the original values in `.x`, unmatched values are not changed. If not supplied and if the replacements are not compatible, unmatched values are replaced with `NA`. `.default` must be either length 1 or the same length as `.x`.
`.missing`	If supplied, any missing values in `.x` will be replaced by this value. Must be either length 1 or the same length as `.x`.
`.ordered`	If `TRUE`, `recode_factor()` creates an ordered factor.
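
The interaction of `.default` and `.missing` described in the table above can be sketched as follows. This is an illustrative example that assumes dplyr is attached; the vectors `x` and `y` and the result names are made up for the sketch.

```r
library(dplyr)

# Hypothetical input with one unmatched value ("z") and one missing value
x <- c("a", "b", NA, "z")

# No .default or .missing: the replacements are character like `.x`,
# so unmatched values are left unchanged and NA stays NA
plain <- recode(x, a = "Apple", b = "Banana")

# Length-one .default and .missing fill unmatched and missing values
filled <- recode(
  x,
  a = "Apple",
  b = "Banana",
  .default = "Other",
  .missing = "Unknown"
)

# A .default with the same length as `.x` is applied element-wise
y <- c("a", "b", "z")
tagged <- recode(y, a = "Apple", .default = paste0(y, "?"))
```

Here `plain` keeps `"z"` and `NA` as they were, `filled` turns them into `"Other"` and `"Unknown"`, and `tagged` picks the default for each unmatched element from the corresponding position of the supplied vector.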

Value

A vector the same length as .x, and the same type as the first of ..., .default, or .missing. recode_factor() returns a factor whose levels are in the same order as in .... The levels in .default and .missing come last.
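
A short sketch of the return value, assuming dplyr is attached (`num`, `out`, and `fct` are names invented for this example): the result takes its type from the replacements rather than from `.x`, and `recode_factor()` orders levels by the replacements, with the `.default` and `.missing` levels last.

```r
library(dplyr)

# Hypothetical numeric input including a missing value
num <- c(1, 2, 3, NA)

# The replacements are character, so the result is character even
# though `.x` is numeric
out <- recode(
  num,
  `1` = "one",
  `2` = "two",
  .default = "other",
  .missing = "none"
)
typeof(out)

# recode_factor() returns a factor; levels follow the order of the
# replacements ("two" before "one"), then .default, then .missing
fct <- recode_factor(
  num,
  `2` = "two",
  `1` = "one",
  .default = "other",
  .missing = "none"
)
levels(fct)
```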

Examples

library(dplyr)

char_vec <- sample(c("a", "b", "c"), 10, replace = TRUE)

# `recode()` is superseded by `case_match()`
recode(char_vec, a = "Apple", b = "Banana")
case_match(char_vec, "a" ~ "Apple", "b" ~ "Banana", .default = char_vec)

# With `case_match()`, you don't need typed missings like `NA_character_`
recode(char_vec, a = "Apple", b = "Banana", .default = NA_character_)
case_match(char_vec, "a" ~ "Apple", "b" ~ "Banana", .default = NA)

# Throws an error as `NA` is logical, not character.
try(recode(char_vec, a = "Apple", b = "Banana", .default = NA))

# `case_match()` is easier to use with numeric vectors, because you don't
# need to turn the numeric values into names
num_vec <- c(1:4, NA)
recode(num_vec, `2` = 20L, `4` = 40L)
case_match(num_vec, 2 ~ 20, 4 ~ 40, .default = num_vec)

# `case_match()` doesn't have the ability to match by position like
# `recode()` does with numeric vectors
recode(num_vec, "a", "b", "c", "d")
recode(c(1,5,3), "a", "b", "c", "d", .default = "nothing")

# With incompatible types, `recode()` only warns and turns unmatched values
# into `NA`, whereas `case_match()` throws an error
recode(num_vec, `2` = "b", `4` = "d")
try(case_match(num_vec, 2 ~ "b", 4 ~ "d", .default = num_vec))

# The factor method of `recode()` can generally be replaced with
# `forcats::fct_recode()`
factor_vec <- factor(c("a", "b", "c"))
recode(factor_vec, a = "Apple")

# `recode_factor()` does not currently have a direct replacement, but we
# plan to add one to forcats. In the meantime, you can use the `.ptype`
# argument to `case_match()`.
recode_factor(
  num_vec,
  `1` = "z",
  `2` = "y",
  `3` = "x",
  .default = "D",
  .missing = "M"
)
case_match(
  num_vec,
  1 ~ "z",
  2 ~ "y",
  3 ~ "x",
  NA ~ "M",
  .default = "D",
  .ptype = factor(levels = c("z", "y", "x", "D", "M"))
)