interval_join {fuzzyjoin} | R Documentation |
Join two tables based on overlapping (low, high) intervals
Description
Joins tables based on overlapping intervals: for example, joining
the row (1, 4) with (3, 6), but not with (5, 10). This operation is sped up
using interval trees as implemented in the IRanges package. You
can specify particular relationships between intervals (such as a maximum gap,
or a minimum overlap) through arguments passed on to findOverlaps.
See that documentation for descriptions of such arguments.
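To make the overlap rule concrete, here is a minimal base R sketch of the pairwise test that the description implies. The helper `intervals_overlap` is hypothetical and not part of fuzzyjoin, and it assumes closed intervals (both endpoints included). The package itself delegates the matching to IRanges rather than comparing every pair like this.

```r
# Hypothetical helper (not part of fuzzyjoin): TRUE when the closed
# intervals [s1, e1] and [s2, e2] share at least one point.
intervals_overlap <- function(s1, e1, s2, e2) {
  s1 <= e2 & s2 <= e1
}

# The two cases from the description:
overlap_1_4_vs_3_6  <- intervals_overlap(1, 4, 3, 6)   # (1, 4) and (3, 6)
overlap_1_4_vs_5_10 <- intervals_overlap(1, 4, 5, 10)  # (1, 4) and (5, 10)

print(overlap_1_4_vs_3_6)   # TRUE
print(overlap_1_4_vs_5_10)  # FALSE
```

The helper is vectorized, so it could also be applied to whole columns of start and end values, although that naive approach compares every pair and does not scale the way an interval tree does.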
Usage
interval_join(x, y, by, mode = "inner", ...)
interval_inner_join(x, y, by = NULL, ...)
interval_left_join(x, y, by = NULL, ...)
interval_right_join(x, y, by = NULL, ...)
interval_full_join(x, y, by = NULL, ...)
interval_semi_join(x, y, by = NULL, ...)
interval_anti_join(x, y, by = NULL, ...)
Arguments
x | A tbl |
y | A tbl |
by | Columns by which to join the two tables. If provided, this must be two columns: start of interval, then end of interval. |
mode | One of "inner", "left", "right", "full", "semi", or "anti" |
... | Extra arguments passed on to findOverlaps |
Details
This allows joining on date or datetime intervals. It throws an error if the date/datetime type differs between the two tables.
This requires the IRanges package from Bioconductor. See here for installation: https://bioconductor.org/packages/release/bioc/html/IRanges.html.
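As an illustration of joining on date intervals, the sketch below builds two small tables whose start and end columns are both of class Date. The data, the table names `stays` and `promos`, and their columns are made up for this example. The join only runs if both fuzzyjoin and IRanges are installed.

```r
# Hypothetical data: two tables of Date intervals with matching column types.
stays <- data.frame(
  guest = c("a", "b"),
  start = as.Date(c("2024-01-01", "2024-02-10")),
  end   = as.Date(c("2024-01-05", "2024-02-12"))
)
promos <- data.frame(
  promo = c("winter", "spring"),
  start = as.Date(c("2024-01-03", "2024-03-01")),
  end   = as.Date(c("2024-01-20", "2024-03-15"))
)

# Guarded so the sketch is a no-op when the packages are missing.
if (requireNamespace("fuzzyjoin", quietly = TRUE) &&
    requireNamespace("IRanges", quietly = TRUE)) {
  print(fuzzyjoin::interval_inner_join(stays, promos, by = c("start", "end")))
}
```

Both tables use Date here on purpose. Per the note above, mixing types (for example Date in one table and a datetime class in the other) is expected to raise an error, so convert the columns to a common type before joining.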
Examples
if (requireNamespace("IRanges", quietly = TRUE)) {
  x1 <- data.frame(id1 = 1:3, start = c(1, 5, 10), end = c(3, 7, 15))
  x2 <- data.frame(id2 = 1:3, start = c(2, 4, 16), end = c(4, 8, 20))

  interval_inner_join(x1, x2)

  # Allow intervals to be separated by a gap, up to a maximum:
  interval_inner_join(x1, x2, maxgap = 1)   # let 1 join with 2
  interval_inner_join(x1, x2, maxgap = 20)  # every row joins with every other row

  # Require that they overlap by at least a particular amount:
  interval_inner_join(x1, x2, minoverlap = 3)

  # Other types of joins:
  interval_full_join(x1, x2)
  interval_left_join(x1, x2)
  interval_right_join(x1, x2)
  interval_semi_join(x1, x2)
  interval_anti_join(x1, x2)
}
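The examples above use the mode-specific wrappers. As a sketch of the general form shown under Usage, the same kind of join can be requested through interval_join with an explicit by and mode. This reuses the example tables, assumes fuzzyjoin and IRanges are installed, and makes no claim about the exact output.

```r
x1 <- data.frame(id1 = 1:3, start = c(1, 5, 10), end = c(3, 7, 15))
x2 <- data.frame(id2 = 1:3, start = c(2, 4, 16), end = c(4, 8, 20))

# Guarded so the sketch is a no-op when the packages are missing.
if (requireNamespace("fuzzyjoin", quietly = TRUE) &&
    requireNamespace("IRanges", quietly = TRUE)) {
  # by names the start column first, then the end column.
  via_mode <- fuzzyjoin::interval_join(x1, x2, by = c("start", "end"), mode = "left")

  # The wrapper with the same by is expected to give the same rows.
  via_wrapper <- fuzzyjoin::interval_left_join(x1, x2, by = c("start", "end"))

  print(via_mode)
  print(via_wrapper)
}
```

Passing mode explicitly can be convenient when the join type is chosen at run time, for instance from a function argument.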