count_compare {eHDPrep} R Documentation

## Compare unique values before and after data modification

### Description

Compares variables before and after a modification has been applied, so that changes made during the dataset preparation process can be inspected and reviewed manually.

### Usage

count_compare(
  cols2compare,
  before_tbl = NULL,
  after_tbl = NULL,
  only_diff = FALSE,
  kableout = TRUE,
  caption = NULL,
  latex_wrap = FALSE
)


### Arguments

| Argument | Description |
|---|---|
| cols2compare | Variables to compare between tables. |
| before_tbl | Data frame from before the modification was made. |
| after_tbl | Data frame from after the modification was made. |
| only_diff | Keep only rows which differ between the tables (good for variables with many unique values, such as numeric variables). |
| kableout | Should the output be a kable from knitr? If not, returns a tibble. (Default: TRUE) |
| caption | Caption for kable's caption parameter. |
| latex_wrap | Should tables be aligned vertically rather than horizontally? Useful for wide tables which would otherwise run off a page in LaTeX format. |

### Details

The purpose of this function is to summarise individual alterations in a dataset. It works best with categorical variables. The output contains two tables derived from the parameters before_tbl and after_tbl. Each table shows the unique combinations of values in the variables specified in the parameter cols2compare, for each variable that is present in that table. The tables are presented as two sub-tables and therefore share a single table caption. When the parameter caption is not specified, this caption is generated automatically and describes the content of the two sub-tables. The default output is a kable containing two sub-kables; however, if the parameter kableout is FALSE, a list containing the two tibbles is returned. This may be preferable for further analysis of the tables' contents.
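To make the idea of tallying unique combinations concrete, the following is a minimal sketch using dplyr on toy data. It is not eHDPrep's implementation: the data, the helper name `tally`, and the use of `anti_join()` to approximate what only_diff keeps are all assumptions made for illustration.

```r
library(dplyr)

# Toy data (hypothetical): the modification recodes "T2" to "type 2"
before <- tibble(
  id = 1:5,
  diabetes_type = c("T2", "T2", "type 1", NA, "type 1")
)
after <- before
after$diabetes_type[after$diabetes_type %in% "T2"] <- "type 2"

# The second name is deliberately absent from both tables
cols <- c("diabetes_type", "not_a_column")

# Hypothetical helper: count the unique combinations of whichever
# requested columns are actually present in the table
tally <- function(tbl, cols) {
  count(tbl, across(any_of(cols)))
}

before_n <- tally(before, cols)
after_n  <- tally(after, cols)

# Assumed approximation of only_diff: keep the rows of each tally
# that have no exact match (values and count) in the other tally
before_diff <- anti_join(before_n, after_n, by = names(before_n))
after_diff  <- anti_join(after_n, before_n, by = names(after_n))

print(before_n)
print(after_n)
print(before_diff)
print(after_diff)
```

Comparing `before_n` with `after_n` side by side is the kind of manual review the function is meant to support. The two difference tables narrow the view to the values that changed, which is most helpful when a variable has many unique values.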

### Value

Returns a list of two tibbles or a kable (see the kableout argument), each tallying unique values in the specified columns of each input table.

### Examples

library(eHDPrep)

# merge data as the example modification
example_data_merged <- merge_cols(example_data, diabetes_type, diabetes,
                                  "diabetes_merged", rm_in_vars = TRUE)

# review the differences between the input and output of the variable merging step above:
count_compare(before_tbl = example_data,
              after_tbl = example_data_merged,
              cols2compare = c("diabetes", "diabetes_type", "diabetes_merged"),
              kableout = FALSE)
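
When kableout is FALSE, the returned list can be stored and inspected further. The sketch below assumes the eHDPrep package is installed. It also assumes that the first element corresponds to before_tbl and the second to after_tbl, which this page does not state, so the elements are accessed by position and the variable names are illustrative only.

```r
library(eHDPrep)

# Same modification as in the example above
example_data_merged <- merge_cols(example_data, diabetes_type, diabetes,
                                  "diabetes_merged", rm_in_vars = TRUE)

# Store the result instead of printing it
comparison <- count_compare(
  before_tbl = example_data,
  after_tbl = example_data_merged,
  cols2compare = c("diabetes", "diabetes_type", "diabetes_merged"),
  kableout = FALSE
)

# Overview of the list structure
str(comparison, max.level = 1)

# Assumed order: tally from before_tbl first, tally from after_tbl second
before_counts <- comparison[[1]]
after_counts  <- comparison[[2]]

print(before_counts)
print(after_counts)
```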


[Package eHDPrep version 1.2.1 Index]