merge_cols {eHDPrep}    R Documentation

Merge columns in data frame

Description

Merges two columns of a single data frame into one. The merge relies on coalesce from 'dplyr': wherever primary_var is missing, the value from the corresponding row of secondary_var is used instead. The name of the merged variable is set with merge_var_name. The input variables, primary_var and secondary_var, can be removed from the output by setting rm_in_vars = TRUE. The two variables must be of combinable types (e.g. not one numeric and one character).
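The coalescing behaviour described above can be illustrated directly with dplyr::coalesce on two plain vectors. This is only a minimal sketch of the underlying idea, not the internals of merge_cols; the vector names and values (primary, secondary, measured, reported) are made up for illustration.

```r
library(dplyr)

# Hypothetical vectors standing in for primary_var and secondary_var
primary   <- c("type 1", NA, "type 2", NA)
secondary <- c("yes", "yes", "yes", NA)

# Missing values in `primary` are filled from the same position in `secondary`;
# positions missing in both remain NA
merged <- coalesce(primary, secondary)
merged

# Vectors of different types (e.g. numeric and character) are not combinable,
# so one of them would need converting first (an assumption about how a user
# might work around the restriction, not something merge_cols does itself)
measured <- c(1, NA)
reported <- c("a", "b")
merged_chr <- coalesce(as.character(measured), reported)
merged_chr
```

Values present in the primary vector always take precedence; the secondary vector only contributes where the primary one is NA.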

Usage

merge_cols(
  data,
  primary_var,
  secondary_var,
  merge_var_name = NULL,
  rm_in_vars = FALSE
)

Arguments

data

data frame containing primary_var and secondary_var.

primary_var

Data variable containing the best-quality / most detailed information. Its missing values are filled with the values in the corresponding rows of secondary_var.

secondary_var

Data variable which will be used to fill missing values in primary_var.

merge_var_name

character constant. Name for merged variable. Default: [primary_var]_[secondary_var]_merged

rm_in_vars

logical constant. Should primary_var and secondary_var be removed? Default: FALSE

Value

data frame with an added variable holding the coalesced values of primary_var and secondary_var

See Also

coalesce

Examples

data(example_data)

# preserve input variables (default)
res <- merge_cols(example_data, diabetes_type, diabetes)
dplyr::select(res, dplyr::starts_with("diabetes"))

# remove input variables
res <- merge_cols(example_data, diabetes_type, diabetes, rm_in_vars = TRUE)
dplyr::select(res, dplyr::starts_with("diabetes"))
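
The merged variable can also be given an explicit name through merge_var_name instead of the default [primary_var]_[secondary_var]_merged. The sketch below assumes the same example_data as above; the name "diabetes_merged" is chosen purely for illustration.

```r
library(eHDPrep)
data(example_data)

# name the merged variable explicitly and drop the two input variables
# ("diabetes_merged" is an arbitrary, illustrative name)
res <- merge_cols(
  example_data,
  primary_var = diabetes_type,
  secondary_var = diabetes,
  merge_var_name = "diabetes_merged",
  rm_in_vars = TRUE
)
dplyr::select(res, dplyr::starts_with("diabetes"))
```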


[Package eHDPrep version 1.3.3 Index]