compare_df {metaConvert} | R Documentation
Flag the differences between two dataframes.
Description
Flag the differences between two dataframes.
Usage
compare_df(
  df_extractor_1,
  df_extractor_2,
  ordering_columns = NULL,
  tolerance = 0,
  tolerance_type = "ratio",
  output = "html",
  file_name = "comparison.xlsx"
)
Arguments
df_extractor_1
  a first dataset. Differences with the second dataset will be flagged in green.
df_extractor_2
  a second dataset. Differences with the first dataset will be flagged in red.
ordering_columns
  column names that should be used to re-order the two datasets before running the comparisons
tolerance
  the cut-off value used to flag differences between two numeric values
tolerance_type
  must be either 'difference' or 'ratio'
output
  type of object returned by the function (see 'Value' section). Must be either 'wide', 'long', 'html', 'html2' or 'xlsx'.
file_name
  the name of the generated file (only used when
Details
This function facilitates the comparison of two datasets created independently by blinded data extractors during a systematic review. It is a wrapper around several functions from the 'compareDF' package.
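To give a feel for the kind of row-level diff that the 'compareDF' package produces, here is a minimal sketch that calls compareDF::compare_df directly on two toy data frames. The data and the object names (extractor_1, extractor_2, cmp) are made up for illustration, and this is not necessarily how metaConvert calls compareDF internally.

```r
library(compareDF)

# Hypothetical extractions of the same three studies by two reviewers
extractor_1 <- data.frame(
  study_id = c(1, 2, 3),
  n_cases = c(50, 40, 30),
  mean_cases = c(1.20, 2.50, 3.10)
)
extractor_2 <- data.frame(
  study_id = c(1, 2, 3),
  n_cases = c(50, 42, 30),
  mean_cases = c(1.20, 2.50, 3.40)
)

# group_col identifies which rows should be matched across the two data frames
cmp <- compareDF::compare_df(
  df_new = extractor_1,
  df_old = extractor_2,
  group_col = "study_id"
)

# Only the rows that differ are kept; study 1 is identical in both extractions
print(cmp$comparison_df)
```

In this sketch, group_col plays a role similar to the row-matching done through ordering_columns above, although the exact mapping between the two is an assumption.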
Value
This function returns a dataframe composed of the rows that include a difference (all identical rows are removed). Several outputs can be requested:
setting output="xlsx" returns an Excel file. A message indicates the location of the generated file on your computer.
setting output="html" returns an html file.
setting output="html2" returns an html file (only useful when the "html" command did not make the html pane appear in RStudio).
setting output="wide" returns a wide dataframe.
setting output="long" returns a long dataframe.
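The dataframe outputs can be assigned and inspected like any other R object. The following sketch assumes the example datasets df.compare1 and df.compare2 used in the Examples section are available once metaConvert is attached; the object names res_wide and res_long are illustrative only.

```r
library(metaConvert)

# Request the flagged rows as a wide dataframe
res_wide <- metaConvert::compare_df(
  df_extractor_1 = df.compare1,
  df_extractor_2 = df.compare2,
  ordering_columns = "study_id",
  output = "wide"
)

# Request the same comparison as a long dataframe
res_long <- metaConvert::compare_df(
  df_extractor_1 = df.compare1,
  df_extractor_2 = df.compare2,
  ordering_columns = "study_id",
  output = "long"
)

# Inspect the first flagged rows of each layout
head(res_wide)
head(res_long)
```

Calling the function with the metaConvert:: prefix avoids ambiguity if the 'compareDF' package, which exports a function of the same name, is also attached.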
References
Alex Joseph (2022). compareDF: Do a Git Style Diff of the Rows Between Two Dataframes with Similar Structure. R package version 2.3.3. https://CRAN.R-project.org/package=compareDF
Examples
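library(metaConvert)

# Sort the two extractions on different columns so that their rows no longer line up
df.compare1 <- df.compare1[order(df.compare1$author), ]
df.compare2 <- df.compare2[order(df.compare2$year), ]
# Rename a column in the first dataset only (the new name suggests this is meant to trigger a warning)
names(df.compare1)[2] <- "generate_warning"
# ordering_columns re-orders both datasets by study_id before the comparison
compare_df(
  df_extractor_1 = df.compare1,
  df_extractor_2 = df.compare2,
  ordering_columns = c("study_id")
)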