getMatches {fastLink}    R Documentation

getMatches

Description

Subset two data frames to the matches returned by fastLink() or matchesLink(). Can also return a single deduped data frame if dfA and dfB are identical and fl.out is of class 'fastLink.dedupe'.

Usage

getMatches(dfA, dfB, fl.out, threshold.match = 0.85, combine.dfs = TRUE)

Arguments

dfA

Dataset A - matched to Dataset B by fastLink().

dfB

Dataset B - matched to Dataset A by fastLink().

fl.out

The output from either fastLink() or matchesLink().

threshold.match

A number between 0 and 1 giving the lower bound on the posterior match probability required to declare a pair a match. For instance, threshold.match = .85 will return all pairs with posterior probability greater than .85 as matches. Default is 0.85.

combine.dfs

Whether to combine the two matched data frames into a single data frame. If FALSE, the two data frames are returned in a list. Default is TRUE.
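
The thresholding rule described for threshold.match can be illustrated with a minimal base R sketch. The posterior values below are made up for illustration and are not fastLink output; the vector name posterior is hypothetical. How a pair whose probability equals the threshold exactly is treated is not specified here, so the sketch avoids that case.

```r
# Hypothetical posterior match probabilities for five candidate pairs
# (illustrative values only, not produced by fastLink).
posterior <- c(0.99, 0.62, 0.90, 0.40, 0.87)
threshold.match <- 0.85

# A pair is declared a match when its posterior probability
# is greater than the threshold.
is.match <- posterior > threshold.match

# Indices of the pairs that would be kept as matches.
which(is.match)
```

Raising threshold.match keeps fewer, higher-confidence pairs; lowering it keeps more pairs at the cost of more false matches.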

Value

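If combine.dfs = TRUE (the default), getMatches() returns a single data frame combining the matched rows. If combine.dfs = FALSE, it returns a list of two data frames: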

dfA.match

The rows of dfA that were successfully matched.

dfB.match

The rows of dfB that were successfully matched.
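
As a sketch of how the two list elements might be accessed, the following requests separate data frames with combine.dfs = FALSE. It assumes that the fastLink package is installed and that its bundled samplematch data provides dfA and dfB with the columns used below; the stricter threshold of 0.90 is an arbitrary choice for illustration.

```r
# Run only when fastLink is available; otherwise the sketch is skipped.
if (requireNamespace("fastLink", quietly = TRUE)) {
  library(fastLink)

  # Assumption: samplematch ships with fastLink and loads dfA and dfB.
  data(samplematch)

  fl.out <- fastLink(dfA, dfB,
                     varnames = c("firstname", "lastname",
                                  "streetname", "birthyear"),
                     n.cores = 1)

  # Ask for the matched rows of each data frame separately.
  ret <- getMatches(dfA, dfB, fl.out,
                    threshold.match = 0.90,
                    combine.dfs = FALSE)

  # The list elements documented above.
  head(ret$dfA.match)
  head(ret$dfB.match)
}
```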

Author(s)

Ben Fifield <benfifield@gmail.com>

Examples

## Not run: 
fl.out <- fastLink(dfA, dfB,
                   varnames = c("firstname", "lastname", "streetname", "birthyear"),
                   n.cores = 1)
ret <- getMatches(dfA, dfB, fl.out)

## End(Not run)

[Package fastLink version 0.6.1 Index]