asof_left_join {sparklyr.flint}    R Documentation

Temporal left join

Description

Perform a left-outer join on two 'TimeSeriesRDD's based on inexact timestamp matches, where each record from 'left' with timestamp 't' is matched with the record from 'right' having the most recent timestamp at or before 't'. This is equivalent to 'asof_join()' with 'direction' = ">=". See asof_join.
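
The matching rule described above can be sketched in plain R, without Spark. This is only an illustration of the semantics, not how the package implements the join: 'asof_match_idx' is a hypothetical helper, timestamps are assumed to be plain numbers in seconds, 'right_t' is assumed to be sorted, and the tolerance bound is assumed to be inclusive.

```r
# Hypothetical helper (not part of sparklyr.flint): for each left timestamp,
# return the index of the most recent right timestamp at or before it,
# or NA if there is none within `tol`.
asof_match_idx <- function(left_t, right_t, tol = 0) {
  stopifnot(!is.unsorted(right_t))
  # findInterval() counts how many right_t values are <= each left_t value,
  # which is the index of the most recent right timestamp at or before it
  # (0 when every right timestamp is later).
  idx <- findInterval(left_t, right_t)
  idx[idx == 0L] <- NA_integer_
  # Discard matches that lie further in the past than the tolerance allows.
  too_old <- !is.na(idx) & (left_t - right_t[idx]) > tol
  idx[too_old] <- NA_integer_
  idx
}

# Same timestamps as in the Examples section: left is 1..10, right is 2..11.
idx_exact <- asof_match_idx(1:10, 2:11, tol = 1)

# A case where the tolerance matters: 5 - 3 exceeds 1 second, 10 - 9 does not.
idx_tol <- asof_match_idx(c(5, 10), c(3, 9), tol = 1)
```

In the first call the left record at 't' = 1 has no match because every right timestamp is later; in the second call only the left record at 't' = 10 finds a right record within the tolerance.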

Usage

asof_left_join(
  left,
  right,
  tol = "0ms",
  key_columns = list(),
  left_prefix = NULL,
  right_prefix = NULL
)

Arguments

left

The left 'TimeSeriesRDD'

right

The right 'TimeSeriesRDD'

tol

A character vector specifying a time duration (e.g., "0ns", "5ms", "5s", "1d", etc.) as the tolerance for the absolute difference in timestamp values between each record from 'left' and its matching record from 'right'. By default, 'tol' is "0ms", which means a record from 'left' will only be matched with a record from 'right' if both have exactly the same timestamp.

key_columns

Columns to be used as the matching key among records from 'left' and 'right': if non-empty, then in addition to the matching criteria imposed by timestamps, a record from 'left' will match a record from 'right' only if they also have equal values in all key columns.

left_prefix

A string to prepend to all columns from 'left' after the join (usually for disambiguation purposes if 'left' and 'right' contain overlapping column names).

right_prefix

A string to prepend to all columns from 'right' after the join (usually for disambiguation purposes if 'left' and 'right' contain overlapping column names).
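
To make the effect of 'key_columns' concrete, the following plain-R sketch restricts the as-of match to records with equal key values. It is an illustration under stated assumptions rather than the package's implementation: 'asof_match_by_key' is a hypothetical helper, the data frames and their 'id' and 'v' columns are made up for the example, timestamps are plain numbers in seconds, and the tolerance bound is assumed to be inclusive.

```r
# Hypothetical helper (not part of sparklyr.flint): as-of match on timestamp
# column `t`, restricted to rows whose `key` column has the same value.
asof_match_by_key <- function(left, right, key, tol = 0) {
  vapply(seq_len(nrow(left)), function(i) {
    # Candidates: same key, timestamp at or before the left record, within tol.
    ok <- right[[key]] == left[[key]][i] &
      right$t <= left$t[i] &
      (left$t[i] - right$t) <= tol
    if (!any(ok)) return(NA_integer_)
    cand <- which(ok)
    # Among the candidates, keep the most recent one.
    cand[which.max(right$t[cand])]
  }, integer(1))
}

left <- data.frame(t = c(1, 2, 3), id = c("a", "b", "a"),
                   stringsAsFactors = FALSE)
right <- data.frame(t = c(1, 2, 2), id = c("b", "a", "b"), v = c(10, 20, 30),
                    stringsAsFactors = FALSE)

idx <- asof_match_by_key(left, right, key = "id", tol = 1)
# Left-outer result: every left row is kept, unmatched rows get NA.
joined <- cbind(left, v = right$v[idx])
```

Here the first left record (key "a" at 't' = 1) stays unmatched because the only "a" record in 'right' is later, even though a "b" record exists at the same timestamp.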

See Also

Other Temporal join functions: asof_future_left_join(), asof_join()

Examples


library(sparklyr)
library(sparklyr.flint)

sc <- try_spark_connect(master = "local")
if (!is.null(sc)) {
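  # ts_1 has timestamps 1, 2, ..., 10 (in seconds)
  ts_1 <- copy_to(sc, tibble::tibble(t = seq(10), u = seq(10))) %>%
    from_sdf(is_sorted = TRUE, time_unit = "SECONDS", time_column = "t")
  # ts_2 has timestamps 2, 3, ..., 11 (in seconds)
  ts_2 <- copy_to(sc, tibble::tibble(t = seq(10) + 1, v = seq(10) + 1L)) %>%
    from_sdf(is_sorted = TRUE, time_unit = "SECONDS", time_column = "t")
  # join each ts_1 record with the most recent ts_2 record that is
  # at or before it and at most 1 second older
  left_join_ts <- asof_left_join(ts_1, ts_2, tol = "1s")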
} else {
  message("Unable to establish a Spark connection!")
}


[Package sparklyr.flint version 0.2.2 Index]