anti_join {joyn} | R Documentation |
Anti join on two data frames
Description
This is a joyn wrapper that works in a similar fashion to dplyr::anti_join.
Usage
anti_join(
x,
y,
by = intersect(names(x), names(y)),
copy = FALSE,
suffix = c(".x", ".y"),
keep = NULL,
na_matches = c("na", "never"),
multiple = "all",
relationship = "many-to-many",
y_vars_to_keep = FALSE,
reportvar = getOption("joyn.reportvar"),
reporttype = c("factor", "character", "numeric"),
roll = NULL,
keep_common_vars = FALSE,
sort = TRUE,
verbose = getOption("joyn.verbose"),
...
)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
by |
a character vector of variables to join by. If NULL, the default,
joyn will do a natural join, using all variables with common names across
the two tables. A message lists the variables so that you can check they're
correct (to suppress the message, simply explicitly list the variables that
you want to join). To join by different variables on x and y use a vector
of expressions. For example, |
copy |
If |
suffix |
If there are non-joined duplicate variables in |
keep |
Should the join keys from both
|
na_matches |
Should two |
multiple |
Handling of rows in
|
relationship |
Handling of the expected relationship between the keys of
|
y_vars_to_keep |
character: Vector of variable names in |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after the join concludes. |
reporttype |
character: One of "factor", "character", or "numeric". Default is "factor", the first value listed in Usage. If "numeric", the reporting variable will contain numeric codes of the source and the contents of each observation in the joined table. See below for more information. |
roll |
double: to be implemented |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
sort |
logical: If TRUE, sort by key variables in |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
... |
Arguments passed on to
|
Value
A data frame of the same class as x. The properties of the output
are as close as possible to the ones returned by the dplyr alternative.
See Also
Other dplyr alternatives: full_join(), inner_join(), left_join(), right_join()
Examples
# Simple anti join
library(joyn)
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
                t  = c(1L, 2L, 1L, 2L, NA_integer_),
                x  = 11:15)
y1 = data.table(id = c(1, 2, 4),
                y  = c(11L, 15L, 16))
# Rows of x1 with no matching id in y1; each id appears at most once in y1
anti_join(x1, y1, relationship = "many-to-one")
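
To make explicit which rows an anti join keeps, the sketch below reproduces the filtering step in base R on the same data as the example above. It is only an illustration of anti-join semantics, not joyn's implementation. It does not call joyn, and the name anti_rows is hypothetical. joyn's own output may differ in row order, class, and the presence of the reporting variable.

```r
# Same data as the example above, stored as plain data frames (no packages needed)
x1 <- data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
                 t  = c(1L, 2L, 1L, 2L, NA_integer_),
                 x  = 11:15)
y1 <- data.frame(id = c(1, 2, 4),
                 y  = c(11, 15, 16))

# Anti join on "id": keep the rows of x1 whose key does not appear in y1.
# %in% never returns NA, so the NA key in x1 is kept because y1 has no NA key.
anti_rows <- x1[!(x1$id %in% y1$id), , drop = FALSE]  # hypothetical name

# Only columns of x1 are returned; nothing from y1 is added
print(anti_rows)
```

Here the keys 1 and 2 are present in y1, so only the rows of x1 with id 3 and id NA remain.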