compareDiff {clinUtils}    R Documentation
Get differences between two data.frames
Description
Compare a new and an old data.frame and return the records that were added, removed or changed.
Usage
compareDiff(
  newData,
  oldData,
  referenceVars = intersect(colnames(newData), colnames(oldData)),
  changeableVars = NULL
)
Arguments
newData: data.frame object representing the new data
oldData: data.frame object representing the old data
referenceVars: character vector of the columns in the data that are used as reference for the comparison.
changeableVars: character vector of the columns in the data for which you want to assess the change, e.g. variables that might have changed from the old to the new data.
Value
Object of class 'diff.data', i.e. a data.frame with columns:
'Comparison type': type of difference between the old and new data, either:
'Change': records present in both new and old data, based on the reference variables, but with difference(s) in the changeable variables
'Addition': records with reference variables present in new but not in old data
'Removal': records with reference variables present in old but not in new data
'Version': 'Previous' or 'Current', depending on whether the record represents content from the old or the new data respectively
referenceVars
changeableVars
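
As an illustration of the call and its return value, the following is a minimal sketch on toy data. The data frames and the column names 'id' and 'value' are hypothetical and chosen only for this example; the call is guarded so that it only runs when clinUtils is installed.

```r
# Hypothetical toy data: 'id' identifies a record, 'value' may change
oldData <- data.frame(id = c(1, 2, 3), value = c("a", "b", "c"))
newData <- data.frame(id = c(1, 2, 4), value = c("a", "B", "d"))

# Only run the comparison if clinUtils is installed
if (requireNamespace("clinUtils", quietly = TRUE)) {
  diffs <- clinUtils::compareDiff(
    newData = newData,
    oldData = oldData,
    referenceVars = "id",     # column(s) used to match records
    changeableVars = "value"  # column(s) checked for changes
  )
  # Inspect the returned object and its columns
  print(class(diffs))
  print(diffs)
}
```

Passing referenceVars and changeableVars explicitly, rather than relying on the defaults, makes it clear which columns match records and which are checked for changes.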
Identification of the differences between datasets
The differences between the datasets are identified with the following steps:
records that are identical in the old and new datasets are removed (they are considered as 'Identical' later on)
records with a reference value present in the old dataset but not in the new dataset are considered 'Removal'
records with a reference value present in the new dataset but not in the old dataset are considered 'Addition'
records with a reference value present in both the new and old datasets, after filtering of identical records, and with a difference in the changeable variables are considered 'Change'
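
The steps above can be sketched in base R as follows. This is an illustrative approximation, not the clinUtils implementation: the toy data, the helpers makeKey and label, and the assumptions that reference values are unique within each dataset and never contain the separator character are all introduced for this example only.

```r
# Hypothetical toy data: 'id' is the reference variable, 'value' the changeable one
oldData <- data.frame(id = c(1, 2, 3), value = c("a", "b", "c"), stringsAsFactors = FALSE)
newData <- data.frame(id = c(1, 2, 4), value = c("a", "B", "d"), stringsAsFactors = FALSE)
referenceVars <- "id"
changeableVars <- "value"
allVars <- c(referenceVars, changeableVars)

# Hypothetical helper: one key string per record, built from the given columns
# (assumes the values never contain the separator "\r")
makeKey <- function(data, vars) {
  do.call(paste, c(unname(as.list(data[vars])), sep = "\r"))
}

# Step 1: drop records that are identical in the old and new data
oldDiff <- oldData[!makeKey(oldData, allVars) %in% makeKey(newData, allVars), , drop = FALSE]
newDiff <- newData[!makeKey(newData, allVars) %in% makeKey(oldData, allVars), , drop = FALSE]

oldRef <- makeKey(oldDiff, referenceVars)
newRef <- makeKey(newDiff, referenceVars)

# Step 2: reference value in the old but not in the new data -> 'Removal'
removal <- oldDiff[!oldRef %in% makeKey(newData, referenceVars), , drop = FALSE]
# Step 3: reference value in the new but not in the old data -> 'Addition'
addition <- newDiff[!newRef %in% makeKey(oldData, referenceVars), , drop = FALSE]
# Step 4: reference value in both, remaining after step 1 -> 'Change'
changeOld <- oldDiff[oldRef %in% newRef, , drop = FALSE]
changeNew <- newDiff[newRef %in% oldRef, , drop = FALSE]

# Hypothetical helper: prepend the 'Comparison type' and 'Version' columns
label <- function(data, type, version) {
  data.frame(
    `Comparison type` = rep(type, nrow(data)),
    Version = rep(version, nrow(data)),
    data,
    check.names = FALSE, stringsAsFactors = FALSE
  )
}

result <- rbind(
  label(changeOld, "Change", "Previous"),
  label(changeNew, "Change", "Current"),
  label(addition, "Addition", "Current"),
  label(removal, "Removal", "Previous")
)
rownames(result) <- NULL
print(result)
```

In this toy case the record with id 1 is dropped as identical, id 2 is reported as a change (once per version), id 4 as an addition and id 3 as a removal.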
Author(s)
Laure Cougnaud