| pair_blocking {reclin2} | R Documentation |
Generate pairs using simple blocking
Description
Generates all combinations of records from x and y where the
blocking variables are equal.
Usage
pair_blocking(x, y, on, deduplication = FALSE, add_xy = TRUE)
Arguments
x |
first |
y |
second |
on |
the variables defining the blocks or strata for which
all pairs of |
deduplication |
generate pairs from only |
add_xy |
add |
Details
Generating (all) pairs of the records of two data sets is usually the first step when linking the two data sets. However, this often results in too large a number of pairs. Therefore, blocking is usually applied: only pairs of records that agree on the blocking variables are generated.
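The reduction that blocking gives can be sketched in base R. This is only an illustration of the idea, not how reclin2 implements it. The data frames `x` and `y` and their contents are made up for the example.

```r
# Toy data (hypothetical, for illustration only)
x <- data.frame(id = 1:4, postcode = c("A", "A", "B", "C"))
y <- data.frame(id = 1:3, postcode = c("A", "B", "B"))

# All pairs: every row of x combined with every row of y
all_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(y)))

# Blocked pairs: keep only the pairs that agree on the blocking variable
same_block <- x$postcode[all_pairs$.x] == y$postcode[all_pairs$.y]
blocked_pairs <- all_pairs[same_block, ]

nrow(all_pairs)      # 12 pairs without blocking
nrow(blocked_pairs)  # 4 pairs with blocking on postcode
```

Building the full cross product first, as done here, defeats the purpose on real data. The sketch only shows which pairs survive blocking.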
Value
A data.table with two columns,
.x and .y, is returned. Columns .x and .y are
row numbers from data.frames x and y respectively.
See Also
pair and pair_minsim are other methods
to generate pairs.
Examples
library(reclin2)
data("linkexample1", "linkexample2")
# Generate only the pairs of records that have the same postcode
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")
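As a follow-up sketch, the row numbers in .x and .y can be used to look up the records behind a pair. It assumes the reclin2 package is installed and relies only on the return value described under Value. Which pair comes first is not assumed.

```r
library(reclin2)
data("linkexample1", "linkexample2")

pairs <- pair_blocking(linkexample1, linkexample2, "postcode")

# Number of candidate pairs left after blocking on postcode
nrow(pairs)

# Look up the two records that make up the first pair
linkexample1[pairs$.x[1], ]
linkexample2[pairs$.y[1], ]
```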