R: Merge columns in data frame

merge_cols {eHDPrep}

R Documentation

Merge columns in data frame

Description

Merges two columns of a single data frame. The merge relies on 'dplyr''s coalesce(): missing values in primary_var are replaced by the values in the corresponding rows of secondary_var. The name of the merged variable is set with merge_var_name. primary_var and secondary_var can be removed from the output with rm_in_vars. The two variables must be of combinable types (i.e. not one numeric and one character).
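
The coalescing behaviour described above can be illustrated directly with dplyr::coalesce(). This is a minimal sketch on two made-up vectors (the names primary and secondary and their values are assumptions for illustration, not package data); it shows only the underlying fill rule, not the internals of merge_cols.

```r
library(dplyr)

# Hypothetical vectors standing in for primary_var and secondary_var
primary   <- c("type 1", NA, "type 2", NA)
secondary <- c("yes", "yes", NA, NA)

# Missing values in `primary` are filled from the same position in `secondary`;
# positions missing in both stay NA
merged <- coalesce(primary, secondary)
merged

# Vectors that cannot be combined (numeric with character) raise an error
mismatch <- try(coalesce(c(1, NA), c("a", "b")), silent = TRUE)
inherits(mismatch, "try-error")
```

Values already present in the first vector are never overwritten, which is why the more detailed variable should be passed as primary_var.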

Usage

merge_cols(
  data,
  primary_var,
  secondary_var,
  merge_var_name = NULL,
  rm_in_vars = FALSE
)

Arguments

`data`	data frame containing `primary_var` and `secondary_var`.
`primary_var`	Data variable which contains the best quality / most detailed information. Missing values will be supplied by values in corresponding rows from `secondary_var`.
`secondary_var`	Data variable which will be used to fill missing values in `primary_var`.
`merge_var_name`	character constant. Name for the merged variable. If NULL (the default), the name is [`primary_var`]_[`secondary_var`]_merged.
`rm_in_vars`	logical constant. Should `primary_var` and `secondary_var` be removed from the output? Default: FALSE.
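
As a rough illustration of how these arguments interact, the following sketch reproduces the documented behaviour with plain dplyr verbs on a made-up data frame. The data frame toy, the object names, and the literal default column name are assumptions derived from the argument descriptions; the actual implementation of merge_cols, including where it places the merged column, may differ.

```r
library(dplyr)

# Hypothetical toy data (not the package's example_data)
toy <- data.frame(
  id = 1:4,
  diabetes_type = c("type 1", NA, "type 2", NA),
  diabetes = c("yes", "yes", NA, NA),
  stringsAsFactors = FALSE
)

# Assumed default name, following [primary_var]_[secondary_var]_merged
merged_name <- "diabetes_type_diabetes_merged"

# Comparable to rm_in_vars = FALSE: input columns are kept next to the merged one
kept <- toy %>%
  mutate(!!merged_name := coalesce(diabetes_type, diabetes))

# Comparable to rm_in_vars = TRUE: input columns are dropped after merging
dropped <- kept %>%
  select(-diabetes_type, -diabetes)

names(kept)
names(dropped)
```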

Value

Data frame in which primary_var and secondary_var have been coalesced into the merged variable.

Examples

data(example_data)

# preserve input variables (default)
res <- merge_cols(example_data, diabetes_type, diabetes)
dplyr::select(res, dplyr::starts_with("diabetes"))

# remove input variables
res <- merge_cols(example_data, diabetes_type, diabetes, rm_in_vars = TRUE)
dplyr::select(res, dplyr::starts_with("diabetes"))