Inner, Left, Right, Outer, Semi and Anti Join for Data Tables


These helper functions perform join operations on data tables. Most of them are essentially one-liners. For an overview of join operations, see dplyr's vignette on two-table verbs.
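As a rough orientation, the join types can be sketched with base R on plain data.frames. This is not how the helpers are implemented: base `merge()` and `%in%` stand in for the data.table operations, and the toy tables and the column name `job.id` are chosen purely for illustration.

```r
# Toy tables sharing the (illustrative) key column "job.id"
x = data.frame(job.id = 1:6, x = 1:6)
y = data.frame(job.id = 2:5, extra.col = letters[1:4])

inner = merge(x, y, by = "job.id")                # rows with a match in both tables
left  = merge(x, y, by = "job.id", all.x = TRUE)  # all rows of x, NA where y has no match
right = merge(x, y, by = "job.id", all.y = TRUE)  # all rows of y
outer = merge(x, y, by = "job.id", all = TRUE)    # rows found in x or in y

# Semi and anti joins only filter x; no columns of y are added
semi = x[x$job.id %in% y$job.id, , drop = FALSE]     # rows of x with a match in y
anti = x[!(x$job.id %in% y$job.id), , drop = FALSE]  # rows of x without a match in y
```

With these tables, the inner, right and semi joins return the four rows with ids 2 to 5, the left and outer joins return all six rows, and the anti join returns the rows with ids 1 and 6.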


ijoin(x, y, by = NULL)

ljoin(x, y, by = NULL)

rjoin(x, y, by = NULL)

ojoin(x, y, by = NULL)

sjoin(x, y, by = NULL)

ajoin(x, y, by = NULL)

ujoin(x, y, all.y = FALSE, by = NULL)



x: First data.frame to join.


y: Second data.frame to join.


by: Column name(s) of variables used to match rows in x and y. If not provided, a heuristic similar to the one described in the dplyr vignette is used:

  1. If x is keyed, the existing key is used if y has the same column(s).

  2. If x is not keyed, the intersection of the column names of x and y is used if it is not empty.

  3. Otherwise, an exception is raised.

You may pass a named character vector to merge on columns with different names in x and y: the names of the vector refer to columns of x and the values to the corresponding columns of y.
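The column-matching heuristic can be sketched roughly as follows. `guessBy()` is a hypothetical helper written for this illustration and is not part of the package. It reads the key from the "sorted" attribute, which is where data.table stores it, so it also runs on plain data.frames (which have no key). How the case "x is keyed but y lacks the key columns" is handled is an assumption here: the sketch falls through to the common-columns step.

```r
# Hypothetical helper (not part of the package): choose the columns to join on.
guessBy = function(x, y, by = NULL) {
  if (!is.null(by)) {
    # Explicit `by`: names refer to columns of x, values to columns of y.
    # Unnamed entries use the same column name on both sides.
    x.cols = if (is.null(names(by))) by else ifelse(names(by) == "", by, names(by))
    return(list(x = unname(x.cols), y = unname(by)))
  }
  # 1. Use the key of x if y has the same column(s).
  key = attr(x, "sorted")  # NULL for unkeyed tables and plain data.frames
  if (!is.null(key) && all(key %in% names(y)))
    return(list(x = key, y = key))
  # 2. Assumption: otherwise fall back to the columns common to both tables.
  common = intersect(names(x), names(y))
  if (length(common) > 0L)
    return(list(x = common, y = common))
  # 3. Nothing to join on.
  stop("Unable to guess columns to match on; please provide 'by'")
}

x = data.frame(id = 1:3, a = c("u", "v", "w"))
y = data.frame(id = 2:4, b = c(TRUE, FALSE, TRUE))
cols = guessBy(x, y)  # step 2 applies: the only shared column is "id"
res = merge(x, y, by.x = cols$x, by.y = cols$y)

# Columns with different names: x's "id" is matched with y2's "job"
y2 = data.frame(job = 2:4, b = c(TRUE, FALSE, TRUE))
cols2 = guessBy(x, y2, by = c(id = "job"))
res2 = merge(x, y2, by.x = cols2$x, by.y = cols2$y)
```

Calling `guessBy(x, y2)` without `by` raises an error, since the two tables share no column name.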


all.y: Keep columns of y which are not in x?
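To illustrate what an updating join with and without `all.y` might look like, here is a minimal base R sketch. `updateJoin()` is a hypothetical function written for this illustration, not the package's implementation, and it assumes a single key column whose values are unique in y.

```r
# Hypothetical base R sketch of an updating join; not the package's implementation.
updateJoin = function(x, y, by, all.y = FALSE) {
  # For each row of x, the position of the matching row in y (NA if there is none)
  i = match(x[[by]], y[[by]])
  hit = !is.na(i)
  # Columns to take from y: shared non-key columns, plus y-only columns if all.y = TRUE
  cols = setdiff(names(y), by)
  if (!all.y)
    cols = intersect(cols, names(x))
  for (col in cols) {
    if (col %in% names(x)) {
      x[[col]][hit] = y[[col]][i[hit]]  # overwrite matched rows only
    } else {
      x[[col]] = y[[col]][i]            # new column, NA where y has no match
    }
  }
  x
}

x = data.frame(id = 1:4, status = rep("new", 4), stringsAsFactors = FALSE)
y = data.frame(id = c(2L, 4L), status = c("done", "error"),
  note = c("ok", "retry"), stringsAsFactors = FALSE)

upd = updateJoin(x, y, by = "id")                    # only "status" is updated
upd.all = updateJoin(x, y, by = "id", all.y = TRUE)  # additionally gains column "note"
```

In this sketch, rows of x without a match in y keep their original values, and columns added through `all.y = TRUE` are filled with NA for those rows.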


[data.table] with key identical to by.


library(batchtools)

# Create two tables for demonstration
tmp = makeRegistry(file.dir = NA, make.default = FALSE)
batchMap(identity, x = 1:6, reg = tmp)
x = getJobPars(reg = tmp)
y = findJobs(x >= 2 & x <= 5, reg = tmp)
y$extra.col = head(letters, nrow(y))

# Inner join: similar to intersect(): keep all columns of x and y with common matches
ijoin(x, y)

# Left join: use all ids from x, keep all columns of x and y
ljoin(x, y)

# Right join: use all ids from y, keep all columns of x and y
rjoin(x, y)

# Outer join: similar to union(): keep all columns of x and y with matches in x or y
ojoin(x, y)

# Semi join: keep rows of x with a match in y
sjoin(x, y)

# Anti join: keep rows of x without a match in y
ajoin(x, y)

# Updating join: Replace values in x with values in y
ujoin(x, y)

