join {collapse} | R Documentation |
Fast Table Joins
Description
Join two data frame-like objects x and y on columns. Inspired by polars; by default uses a vectorized hash join algorithm (workhorse function fmatch).
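The idea behind a match-based join can be sketched in base R. This is only an illustration and not the collapse implementation: base match() stands in for fmatch, and the data frames x and y and the key column id are made up for the example.

```r
# Base R sketch of a match-based left join.
# Assumption: match() is used here as a stand-in for collapse's fmatch().
x <- data.frame(id = c(1, 2, 3, 4), name = c("John", "Jane", "Bob", "Carl"))
y <- data.frame(id = c(3, 1, 5), salary = c(70000, 60000, 80000))

# For each row of x, the index of the first matching row of y (NA if none)
m <- match(x$id, y$id)

# Left join: keep every row of x and pull the non-key columns of y by index;
# rows of x without a match receive NA
left <- cbind(x, y[m, setdiff(names(y), "id"), drop = FALSE])
rownames(left) <- NULL

# Inner join: keep only the rows of x that found a match
inner <- left[!is.na(m), , drop = FALSE]
```

Because the match vector m is computed once and then reused to subset y, the same vector can serve several join types, which is roughly why a single matching function can act as the workhorse.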
Usage
join(x, y,
     on = NULL,
     how = "left",
     suffix = NULL,
     validate = "m:m",
     multiple = FALSE,
     sort = FALSE,
     keep.col.order = TRUE,
     drop.dup.cols = FALSE,
     verbose = .op[["verbose"]],
     column = NULL,
     attr = NULL,
     ...
)
Arguments
x |
a data frame-like object. The result will inherit the attributes of this object. |
y |
a data frame-like object to join with |
on |
character. Vector of columns to join on. |
how |
character. Join type: |
suffix |
character(1 or 2). Suffix to add to duplicate column names. |
validate |
character. (Optional) check if join is of specified type. One of |
multiple |
logical. Handling of rows in |
sort |
logical. |
keep.col.order |
logical. Keep order of columns in |
drop.dup.cols |
instead of renaming duplicate columns in |
verbose |
integer. Prints information about the join. One of 0 (off), 1 (default) or 2 (additionally prints the classes of the |
column |
(optional) name for an extra column to generate in the output indicating which dataset a record came from. |
attr |
(optional) name for attribute providing information about the join performed (including the output of |
... |
further arguments to |
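As a small illustration of how on, how, suffix and verbose combine, the following sketch joins two tables that share a non-key column name. The data frames sales_2023 and sales_2024 are hypothetical, and the sketch assumes that a length-2 suffix is applied to the duplicate columns of x and y respectively; check the resulting names rather than relying on that assumption.

```r
library(collapse)

# Hypothetical inputs: both tables carry a non-key column called "value"
sales_2023 <- data.frame(id = c(1, 2, 3), value = c(10, 20, 30))
sales_2024 <- data.frame(id = c(2, 3, 4), value = c(25, 35, 45))

# Inner join on the shared key; verbose = 0 switches off the printed summary
res <- join(sales_2023, sales_2024,
            on = "id",
            how = "inner",
            suffix = c("_2023", "_2024"),  # assumed: first for x, second for y
            verbose = 0)

# Inspect how the duplicate column was renamed
names(res)
```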
Value
A data frame-like object of the same type and attributes as x. "row.names" of x are only preserved in left-join operations.
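A quick way to check this behaviour is to join a data frame with custom row names and inspect the result. The inputs below are made up for the illustration, and the default multiple = FALSE is assumed so that the left join returns one row per row of x.

```r
library(collapse)

# Hypothetical input with non-default row names
x <- data.frame(id = c(1, 2, 3), v = c("a", "b", "c"),
                row.names = c("r1", "r2", "r3"))
y <- data.frame(id = c(2, 3, 4), w = c(10, 20, 30))

# Left join: the result keeps the class and row names of x
res <- join(x, y, on = "id", how = "left", verbose = 0)

class(res)
rownames(res)
```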
See Also
fmatch, Data Frame Manipulation, Fast Grouping and Ordering, Collapse Overview
Examples
df1 <- data.frame(
id1 = c(1, 1, 2, 3),
id2 = c("a", "b", "b", "c"),
name = c("John", "Jane", "Bob", "Carl"),
age = c(35, 28, 42, 50)
)
df2 <- data.frame(
id1 = c(1, 2, 3, 3),
id2 = c("a", "b", "c", "e"),
salary = c(60000, 55000, 70000, 80000),
dept = c("IT", "Marketing", "Sales", "IT")
)
# Different types of joins
for (i in c("l", "i", "r", "f", "s", "a"))
  join(df1, df2, how = i) |> print()
# Adding join column: useful esp. for full join
join(df1, df2, how = "f", column = TRUE)
# Custom column + rearranging
join(df1, df2, how = "f", column = list("join", c("x", "y", "x_y")), keep.col.order = FALSE)
# Attaching match attribute
str(join(df1, df2, attr = TRUE))