break_join {sift}    R Documentation

Join tables based on overlapping intervals.

Description

A user-friendly interface that combines the power of dplyr::left_join and findInterval.
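As background, the following minimal sketch shows what base R's findInterval returns on its own. The vectors breaks and values are made up for illustration; how break_join uses these indices internally is not shown here.

```r
# Break points must be sorted in increasing order for findInterval().
breaks <- c(3, 5)
values <- c(1, 3, 4.2, 5, 9)

# For each value, the index of the last break that is <= the value.
# An index of 0 means the value lies below the first break.
idx <- findInterval(values, breaks)
idx
```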

Usage

break_join(x, y, brk = character(), by = NULL, ...)

Arguments

x

A data frame.

y

A data frame containing the reference information to join onto x.

brk

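Name of the column in x and y to join on by interval overlap. Use a named vector, e.g. c("date" = "start"), when the column is named differently in x and y. The columns must be coercible to numeric.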
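To make the idea of an interval-based match concrete, here is a rough base R sketch for a single group. The data frames and column names (date, start, label) are hypothetical, and the lookup is only an approximation of the idea, not the sift implementation; whether break_join treats values below the first break exactly this way is an assumption.

```r
# Hypothetical data: observations in x, reference periods in y.
x <- data.frame(date = c(1, 4, 6, 10))
y <- data.frame(start = c(3, 5), label = c("first", "second"))

# Sort the reference table by its break column, as findInterval() requires.
y <- y[order(y$start), ]

# Index of the reference row whose start is the latest one <= each date.
idx <- findInterval(x$date, y$start)

# Index 0 means "before the first break"; turn it into NA so the lookup
# yields a missing value instead of silently dropping the element.
idx[idx == 0] <- NA

x$label <- y$label[idx]
x
```

In this sketch the first row of x gets a missing label because its date precedes every start in y.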

by

Joining variables, if needed. See mutate-joins.

...

Additional arguments, automatically directed to findInterval and dplyr::left_join. Argument names must be given in full; partial matching is not supported.
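One findInterval argument that can be passed this way is all.inside, which appears in the examples. The sketch below uses made-up vectors and calls findInterval directly to show its effect on the returned indices; how break_join forwards the argument internally is not shown.

```r
breaks <- c(3, 5, 8)
values <- c(1, 4.2, 9)

# Default: values below the first break get index 0,
# values at or above the last break get index length(breaks).
default_idx <- findInterval(values, breaks)

# all.inside = TRUE coerces indices into 1, ..., length(breaks) - 1,
# so out-of-range values are assigned to the nearest interval.
inside_idx <- findInterval(values, breaks, all.inside = TRUE)

default_idx
inside_idx
```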

Value

An object of the same type as x.

Examples

# joining USA + UK leaders with population time-series
break_join(us_uk_pop, us_uk_leaders, brk = c("date" = "start"))

# simple dataset
set.seed(1)
a <- data.frame(p = c(rep("A", 10), rep("B", 10)), q = runif(20, 0, 10))
b <- data.frame(p = c("A", "A", "B", "B"), q = c(3, 5, 6, 9), r = c("a1", "a2", "b1", "b2"))

break_join(a, b, brk = "q") # p identified as common variable automatically
break_join(a, b, brk = "q", by = "p") # same result
break_join(a, b, brk = "q", all.inside = TRUE) # note missing values have been filled

# joining toll prices with vehicle time-series

library(mopac)
library(dplyr, warn.conflicts = FALSE)
library(hms)

express %>%
  mutate(time_hms = as_hms(time)) %>%
  break_join(rates, brk = c("time_hms" = "time"))

[Package sift version 0.1.0 Index]