pair_blocking {reclin2} | R Documentation |
Generate pairs using simple blocking
Description
Generates all combinations of records from x and y where the blocking variables are equal.
Usage
pair_blocking(x, y, on, deduplication = FALSE, add_xy = TRUE)
Arguments
x | first data.frame |
y | second data.frame |
on | the variables defining the blocks or strata for which all pairs of x and y will be generated |
deduplication | generate pairs from only x |
add_xy | add x and y as attributes to the returned pairs |
Details
Generating all pairs of the records of two data sets is usually the first step when linking the two data sets. However, this often results in too many pairs to handle. Therefore, blocking is usually applied: pairs are only generated for records that agree on the blocking variables.
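The effect of blocking can be sketched in base R without the package. The toy data frames and their column names below are made up for this illustration, and the filter on all pairs is only a way to show the idea; it is not a claim about how pair_blocking is implemented internally.

```r
# Toy data; these records and column names are hypothetical.
x <- data.frame(id = 1:4, postcode = c("6711", "6711", "6712", "6713"))
y <- data.frame(id = 1:3, postcode = c("6711", "6712", "6712"))

# Without blocking: every record of x is paired with every record of y.
all_pairs <- expand.grid(.x = seq_len(nrow(x)), .y = seq_len(nrow(y)))
nrow(all_pairs)  # 4 * 3 = 12 pairs

# With blocking on postcode: keep only pairs whose postcodes are equal.
same_block <- x$postcode[all_pairs$.x] == y$postcode[all_pairs$.y]
blocked <- all_pairs[same_block, ]
nrow(blocked)    # 2 * 1 + 1 * 2 = 4 pairs
```

The number of unblocked pairs grows with the product of the two data set sizes, while the number of blocked pairs depends only on the block sizes.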
Value
A data.table with two columns, .x and .y, is returned. Columns .x and .y are row numbers from data.frames x and y respectively.
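Because the two columns hold row numbers, they can be used to look up the original records. The following is a minimal sketch, assuming the reclin2 package and its bundled linkexample1 and linkexample2 data sets are available.

```r
library(reclin2)
data("linkexample1", "linkexample2")

pairs <- pair_blocking(linkexample1, linkexample2, "postcode")

# Row numbers of the first pair in x and y respectively.
i <- pairs$.x[1]
j <- pairs$.y[1]

# The records behind that pair; they share the same postcode.
linkexample1[i, ]
linkexample2[j, ]
```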
See Also
pair and pair_minsim are other methods to generate pairs.
Examples
data("linkexample1", "linkexample2")
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")
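Since on is described as "the variables" defining the blocks, more than one blocking variable can presumably be passed as a character vector. The sketch below assumes this works as described and that linkexample1 and linkexample2 both contain a sex column; treat it as an illustration rather than documented behaviour.

```r
library(reclin2)
data("linkexample1", "linkexample2")

# Blocking on postcode only.
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")

# Blocking on postcode and sex (assumed usage): both must be equal.
pairs2 <- pair_blocking(linkexample1, linkexample2, c("postcode", "sex"))

# Stricter blocking cannot produce more pairs than looser blocking.
nrow(pairs)
nrow(pairs2)
```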