count_compare {eHDPrep}

R Documentation

Compare unique values before and after data modification

Description

Compares variables before and after a change has been applied, so that modifications made during dataset preparation can be manually inspected and reviewed.

Usage

count_compare(
  cols2compare,
  before_tbl = NULL,
  after_tbl = NULL,
  only_diff = FALSE,
  kableout = TRUE,
  caption = NULL,
  latex_wrap = FALSE
)

Arguments

`cols2compare`	Variables to compare between tables.
`before_tbl`	Data frame from before modification was made.
`after_tbl`	Data frame from after modification was made.
`only_diff`	Keep only rows which differ between the tables (good for variables with many unique values, such as numeric variables). (Default: FALSE)
`kableout`	Should output be a `kable` from `knitr`? If not, a list of two `tibble`s is returned. (Default: TRUE)
`caption`	Caption for `kable`'s `caption` parameter. If NULL, a caption is generated automatically.
`latex_wrap`	Should tables be aligned vertically rather than horizontally? Useful for wide tables which would otherwise run off a page in LaTeX format. (Default: FALSE)
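
To illustrate the idea behind `only_diff`, here is a minimal base R sketch. It assumes that "rows which differ" means tally rows (a value together with its count) that do not appear identically in the other table; this reading is an assumption and not necessarily how eHDPrep implements it. The data and the helpers `tally` and `row_key` are hypothetical.

```r
# Hypothetical numeric variable in which one implausible value (999)
# was replaced with NA during data preparation.
before <- data.frame(height = c(150, 160, 160, 999))
after  <- data.frame(height = c(150, 160, 160, NA))

# Illustrative helper (not an eHDPrep function): tally unique values,
# keeping NA as its own category.
tally <- function(x) {
  as.data.frame(table(value = x, useNA = "ifany"),
                responseName = "n", stringsAsFactors = FALSE)
}

before_counts <- tally(before$height)
after_counts  <- tally(after$height)

# Build a key from value and count so whole rows can be compared.
row_key <- function(d) paste(d$value, d$n, sep = "|")

# Assumed meaning of only_diff: drop rows that are identical in both tallies.
before_diff <- before_counts[!row_key(before_counts) %in% row_key(after_counts), ]
after_diff  <- after_counts[!row_key(after_counts) %in% row_key(before_counts), ]

print(before_diff)
print(after_diff)
```

With many unique values, most tally rows are unchanged, so filtering in this way leaves only the values affected by the modification.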

Details

This function summarises individual alterations in a dataset and works best with categorical variables. The output contains two tables, one derived from `before_tbl` and one from `after_tbl`. Each table shows the unique combinations of values in the variables specified in `cols2compare` that are present in that table. The tables are presented as two sub-tables and therefore share a single table caption. When `caption` is not specified, a caption describing the content of the two sub-tables is generated automatically. The default output is a kable containing two sub-kables. If `kableout` is FALSE, a list containing the two tibbles is returned instead, which may be preferable for further analysis of the tables' contents.
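
The tallying described above can be sketched in base R. This is only an illustration of the concept, not the package's implementation: the toy data and the helper `tally_unique` are hypothetical, and base R's `aggregate` stands in for whatever eHDPrep uses internally.

```r
# Hypothetical toy data standing in for before_tbl and after_tbl.
before <- data.frame(
  diabetes      = c("yes", "yes", "no", "no"),
  diabetes_type = c("type 1", "type 2", NA, NA),
  stringsAsFactors = FALSE
)
after <- data.frame(
  diabetes_merged = c("type 1", "type 2", "no", "no"),
  stringsAsFactors = FALSE
)

cols2compare <- c("diabetes", "diabetes_type", "diabetes_merged")

# Illustrative helper (not an eHDPrep function): count each unique
# combination of the requested columns that are present in tbl.
tally_unique <- function(tbl, cols) {
  present <- intersect(cols, names(tbl))
  # Label NA explicitly so that missing values are tallied as well.
  keys <- lapply(tbl[present], function(x) {
    ifelse(is.na(x), "<NA>", as.character(x))
  })
  aggregate(data.frame(n = rep(1L, nrow(tbl))), by = keys, FUN = sum)
}

before_counts <- tally_unique(before, cols2compare)
after_counts  <- tally_unique(after, cols2compare)

print(before_counts)
print(after_counts)
```

Only the columns found in each table are tallied, so the "before" table is summarised by `diabetes` and `diabetes_type`, and the "after" table by `diabetes_merged`.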

Value

Returns a list of two tibbles or a kable (see the `kableout` argument). Each table tallies the unique values in the specified columns of the corresponding input table.

Examples

# merge data as the example modification
example_data_merged <- merge_cols(example_data, diabetes_type, diabetes,
                                  "diabetes_merged", rm_in_vars = TRUE)

# review the differences between the input and output of the variable merging step above:
count_compare(before_tbl = example_data,
              after_tbl = example_data_merged,
              cols2compare = c("diabetes", "diabetes_type", "diabetes_merged"),
              kableout = FALSE)

[Package eHDPrep version 1.3.3 Index]