| join.tbl_sql {dbplyr} | R Documentation | 
Join SQL tables
Description
These are methods for the dplyr join generics. They are translated to the following SQL queries:
-  inner_join(x, y):SELECT * FROM x JOIN y ON x.a = y.a
-  left_join(x, y):SELECT * FROM x LEFT JOIN y ON x.a = y.a
-  right_join(x, y):SELECT * FROM x RIGHT JOIN y ON x.a = y.a
-  full_join(x, y):SELECT * FROM x FULL JOIN y ON x.a = y.a
-  semi_join(x, y):SELECT * FROM x WHERE EXISTS (SELECT 1 FROM y WHERE x.a = y.a)
-  anti_join(x, y):SELECT * FROM x WHERE NOT EXISTS (SELECT 1 FROM y WHERE x.a = y.a)
Usage
## S3 method for class 'tbl_lazy'
inner_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = NULL,
  ...,
  keep = NULL,
  na_matches = c("never", "na"),
  multiple = NULL,
  unmatched = "drop",
  relationship = NULL,
  sql_on = NULL,
  auto_index = FALSE,
  x_as = NULL,
  y_as = NULL
)
## S3 method for class 'tbl_lazy'
left_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = NULL,
  ...,
  keep = NULL,
  na_matches = c("never", "na"),
  multiple = NULL,
  unmatched = "drop",
  relationship = NULL,
  sql_on = NULL,
  auto_index = FALSE,
  x_as = NULL,
  y_as = NULL
)
## S3 method for class 'tbl_lazy'
right_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = NULL,
  ...,
  keep = NULL,
  na_matches = c("never", "na"),
  multiple = NULL,
  unmatched = "drop",
  relationship = NULL,
  sql_on = NULL,
  auto_index = FALSE,
  x_as = NULL,
  y_as = NULL
)
## S3 method for class 'tbl_lazy'
full_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = NULL,
  ...,
  keep = NULL,
  na_matches = c("never", "na"),
  multiple = NULL,
  relationship = NULL,
  sql_on = NULL,
  auto_index = FALSE,
  x_as = NULL,
  y_as = NULL
)
## S3 method for class 'tbl_lazy'
cross_join(
  x,
  y,
  ...,
  copy = FALSE,
  suffix = c(".x", ".y"),
  x_as = NULL,
  y_as = NULL
)
## S3 method for class 'tbl_lazy'
semi_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  ...,
  na_matches = c("never", "na"),
  sql_on = NULL,
  auto_index = FALSE,
  x_as = NULL,
  y_as = NULL
)
## S3 method for class 'tbl_lazy'
anti_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  ...,
  na_matches = c("never", "na"),
  sql_on = NULL,
  auto_index = FALSE,
  x_as = NULL,
  y_as = NULL
)
Arguments
| x,y | A pair of lazy data frames backed by database queries. | 
| by | A join specification created with  If  To join on different variables between  To join by multiple variables, use a  
 For simple equality joins, you can alternatively specify a character vector
of variable names to join by. For example,  To perform a cross-join, generating all combinations of  | 
| copy | If  This allows you to join tables across srcs, but it's potentially expensive operation so you must opt into it. | 
| suffix | If there are non-joined duplicate variables in  | 
| ... | Other parameters passed onto methods. | 
| keep | Should the join keys from both  
 | 
| na_matches | Should NA (NULL) values match one another?
The default, "never", is how databases usually work.  | 
| multiple,unmatched | Unsupported in database backends. As a workaround for multiple use a unique key and for unmatched a foreign key constraint. | 
| relationship | Unsupported in database backends. | 
| sql_on | A custom join predicate as an SQL expression.
Usually joins use column equality, but you can perform more complex
queries by supply  | 
| auto_index | if  | 
| x_as,y_as | Alias to use for  | 
Value
Another tbl_lazy. Use show_query() to see the generated
query, and use collect() to execute the query
and return data to R.
Examples
library(dplyr, warn.conflicts = FALSE)
band_db <- tbl_memdb(dplyr::band_members)
instrument_db <- tbl_memdb(dplyr::band_instruments)
band_db %>% left_join(instrument_db) %>% show_query()
# Can join with local data frames by setting copy = TRUE
band_db %>%
  left_join(dplyr::band_instruments, copy = TRUE)
# Unlike R, joins in SQL don't usually match NAs (NULLs)
db <- memdb_frame(x = c(1, 2, NA))
label <- memdb_frame(x = c(1, NA), label = c("one", "missing"))
db %>% left_join(label, by = "x")
# But you can activate R's usual behaviour with the na_matches argument
db %>% left_join(label, by = "x", na_matches = "na")
# By default, joins are equijoins, but you can use `sql_on` to
# express richer relationships
db1 <- memdb_frame(x = 1:5)
db2 <- memdb_frame(x = 1:3, y = letters[1:3])
db1 %>% left_join(db2) %>% show_query()
db1 %>% left_join(db2, sql_on = "LHS.x < RHS.x") %>% show_query()