count_compare {eHDPrep} | R Documentation |
Compare unique values before and after data modification
Description
Performs comparison of variables before and after a change has been applied in order to allow manual inspection and review of modifications made during the dataset preparation process.
Usage
count_compare(
cols2compare,
before_tbl = NULL,
after_tbl = NULL,
only_diff = FALSE,
kableout = TRUE,
caption = NULL,
latex_wrap = FALSE
)
Arguments
cols2compare |
Variables to compare between tables. |
before_tbl |
Data frame from before modification was made. |
after_tbl |
Data frame from after modification was made. |
only_diff |
Keep only rows which differ between the tables (good for variables with many unique values, such as numeric variables). |
kableout |
Should output be a kable? If FALSE, a list of two tibbles is returned instead (see Details). |
caption |
Caption for the kable output. If NULL, a caption is generated automatically (see Details). |
latex_wrap |
Should tables be aligned vertically rather than horizontally? Useful for wide tables that would otherwise run off a page in LaTeX format. |
Details
The purpose of this function is to summarise individual alterations in a
dataset; it works best with categorical variables. The output contains two
tables derived from the parameters before_tbl and after_tbl. Each table
shows the unique combinations of values in the variables specified in the
parameter cols2compare, for those variables that are present in that table.
The tables are presented as two sub-tables and therefore share a single
table caption. When the parameter caption is not specified, this caption is
generated automatically and describes the content of the two sub-tables.
The default output is a kable containing two sub-kables; however, if the
parameter kableout is FALSE, a list containing the two tibbles is returned.
This may be preferable for further analysis of the tables' contents.
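The tallying described above can be approximated with dplyr. The following is a rough sketch only, not the eHDPrep implementation: the toy tibbles and the helper tally_cols are hypothetical, and the sketch assumes that "unique combinations of values" corresponds to counting rows per combination of whichever requested columns exist in each table.

```r
library(dplyr)

# Hypothetical toy data (not shipped with eHDPrep)
before <- tibble(
  diabetes      = c("yes", "no", "yes", "no"),
  diabetes_type = c("type 1", NA, "type 2", NA)
)
after <- tibble(
  diabetes_merged = c("type 1", "no", "type 2", "no")
)

cols <- c("diabetes", "diabetes_type", "diabetes_merged")

# Hypothetical helper: tally unique combinations of the requested
# columns, silently skipping any that are absent from the table
tally_cols <- function(tbl, cols) {
  count(tbl, across(any_of(cols)))
}

# One tally per input table, analogous to the two sub-tables
tallies <- list(
  before = tally_cols(before, cols),
  after  = tally_cols(after, cols)
)
tallies
```

Placing the two tallies side by side makes it possible to check by eye that each combination in the original columns maps onto the expected value in the merged column.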
Value
Returns a list of two tibbles or a kable (see the kableout argument), each
tallying unique values in the specified columns of each input table.
Examples
# merge data as the example modification
example_data_merged <- merge_cols(example_data, diabetes_type, diabetes,
                                  "diabetes_merged", rm_in_vars = TRUE)

# review the differences between the input and output of the variable merging step above:
count_compare(before_tbl = example_data,
              after_tbl = example_data_merged,
              cols2compare = c("diabetes", "diabetes_type", "diabetes_merged"),
              kableout = FALSE)