count_compare {eHDPrep}    R Documentation

Compare unique values before and after data modification


Compares variables before and after a change has been applied, allowing manual inspection and review of modifications made during the dataset preparation process.


count_compare(
  cols2compare,
  before_tbl = NULL,
  after_tbl = NULL,
  only_diff = FALSE,
  kableout = TRUE,
  caption = NULL,
  latex_wrap = FALSE
)



cols2compare    Variables to compare between tables.

before_tbl      Data frame from before modification was made.

after_tbl       Data frame from after modification was made.

only_diff       Keep only rows which differ between the tables (good for variables with many unique values, such as numeric variables).

kableout        Should output be a kable from knitr? If not, returns a tibble. (Default: TRUE)

caption         Caption for kable's caption parameter.

latex_wrap      Should tables be aligned vertically rather than horizontally? Useful for wide tables which would otherwise run off a page in LaTeX format.
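
To illustrate the idea behind only_diff, the following is a minimal sketch using dplyr rather than eHDPrep itself. The toy data, the column name status, and the use of anti_join are assumptions made for illustration; the package's own implementation may differ.

```r
library(dplyr)

# Hypothetical toy data (not from eHDPrep): one value is recoded from "Yes" to "yes"
before <- dplyr::tibble(status = c("Yes", "yes", "no", "no"))
after  <- dplyr::tibble(status = c("yes", "yes", "no", "no"))

# Tally unique values in each table
before_counts <- dplyr::count(before, status)
after_counts  <- dplyr::count(after, status)

# Keep only the tally rows that do not appear identically in the other table
by_cols <- c("status", "n")
before_only <- dplyr::anti_join(before_counts, after_counts, by = by_cols)
after_only  <- dplyr::anti_join(after_counts, before_counts, by = by_cols)

print(before_only)
print(after_only)
```

The unchanged "no" rows drop out of both results, leaving only the values whose tallies were affected by the modification.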


The purpose of this function is to summarise individual alterations in a dataset; it works best with categorical variables. The output contains two tables derived from the parameters before_tbl and after_tbl. Each table shows the unique combinations of values in the variables specified in the parameter cols2compare, for those variables present in that table. The tables are presented as two sub-tables and therefore share a single caption. When the parameter caption is not specified, a caption describing the content of the two sub-tables is generated automatically. The default output is a kable containing two sub-kables; however, if the parameter kableout is FALSE, a list containing the two tibbles is returned. This may be preferable for further analysis of the tables' contents.
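
The tallying of unique combinations described above can be sketched with dplyr alone. This is an illustrative approximation under stated assumptions, not the package's source: the helper tally_present and the toy tables are hypothetical, and only the column names mirror those used in the example for this function.

```r
library(dplyr)

# Hypothetical toy tables standing in for before_tbl and after_tbl
before <- dplyr::tibble(
  diabetes      = c("yes", "yes", "no", "no"),
  diabetes_type = c("type 1", "type 2", NA, NA)
)
after <- dplyr::tibble(
  diabetes_merged = c("type 1", "type 2", "no", "no")
)

cols <- c("diabetes", "diabetes_type", "diabetes_merged")

# Hypothetical helper: tally unique combinations of whichever of the
# requested columns are actually present in the table
tally_present <- function(tbl, cols) {
  dplyr::count(tbl, dplyr::across(dplyr::any_of(cols)))
}

before_counts <- tally_present(before, cols)
after_counts  <- tally_present(after, cols)

print(before_counts)
print(after_counts)
```

Using any_of() means a column that exists in only one table is simply skipped in the other, which matches the "if the variable is present" behaviour described in the text.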


Returns a list of two tibbles or a kable (see the kableout argument), with each table tallying the unique values in the specified columns of one input table.


# merge data as the example modification
example_data_merged <- merge_cols(example_data, diabetes_type, diabetes,
                                  "diabetes_merged", rm_in_vars = TRUE)

# review the differences between the input and output of the variable merging step above:
count_compare(before_tbl = example_data,
              after_tbl = example_data_merged,
              cols2compare = c("diabetes", "diabetes_type", "diabetes_merged"),
              kableout = FALSE)

[Package eHDPrep version 1.2.1 Index]