compareDatasets {crunch}R Documentation

Compare two datasets to see how they will append

Description

When one dataset is appended to another, variables and subvariables are matched on their aliases, and then categories for variables that have them are matched on category name. This function lines up the metadata between two datasets as the append operation will so that you can inspect how well the datasets will align before you do the append.

Usage

compareDatasets(A, B)

Arguments

A

CrunchDataset

B

CrunchDataset

Details

Calling summary on the return of this function will print an overview of places where the matching on variable alias and category name may lead to undesired outcomes, enabling you to alter one or both datasets to result in better alignment.

Value

An object of class 'compareDatasets', a list of three elements: (1) 'variables', a data.frame of variable metadata joined on alias; (2) 'categories', a list of data.frames of category metadata joined on category name, one for each variable with categories; and (3) 'subvariables', a list of data.frames of subvariable metadata joined on alias, one for each array variable.

Summary output reports on (1) variables that, when matched across datasets by alias, have different types; (2) variables that have the same name but don't match on alias; (3) for variables that match and have categories, any categories that have the same id but don't match on name; (4) for array variables that match, any subvariables that have the same name but don't match on alias; and (5) array variables that, after assembling the union of their subvariables, point to subvariables that belong to other arrays.

Examples

## Not run: 
comp <- compareDataset(ds1, ds2)
summary(comp)

## End(Not run)

[Package crunch version 1.30.4 Index]