get_dupes {janitor}    R Documentation
Get rows of a data.frame with identical values for the specified variables.
Description
Use this to hunt for duplicate records during data cleaning. Specify the data.frame and the combination of variables to check, and get back the rows that are duplicated on those variables.
Usage
get_dupes(dat, ...)
Arguments
dat    The input data.frame.
...    Unquoted variable names to search for duplicates. This takes a tidyselect specification.
Value
Returns a data.frame with the full records where the specified variables have duplicated values, as well as a variable dupe_count showing the number of rows sharing that combination of duplicated values. If the input data.frame was of class tbl_df, the output is as well.
Examples
# Attach janitor when running these examples outside the package's help system:
library(janitor)
get_dupes(mtcars, mpg, hp)
# or called with the magrittr pipe %>% :
mtcars %>% get_dupes(wt)
# You can use tidyselect helpers to specify variables:
mtcars %>% get_dupes(-c(wt, qsec))
mtcars %>% get_dupes(starts_with("cy"))
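
To illustrate the dupe_count column described under Value, here is a minimal sketch on a small made-up data frame. The object df and its columns id, visit, and score are hypothetical and chosen only for illustration; the sketch assumes janitor is installed.

```r
library(janitor)

# Hypothetical toy data: the first two rows share id = 1 and visit = "a".
df <- data.frame(
  id    = c(1, 1, 2, 3),
  visit = c("a", "a", "a", "b"),
  score = c(10, 12, 9, 7)
)

# Look for rows duplicated on the combination of id and visit.
dupes <- get_dupes(df, id, visit)

# Only the two rows with id = 1 and visit = "a" are returned in full
# (score included), each with dupe_count equal to 2.
dupes
```

Rows that are unique on the chosen combination, such as id = 2 here, are dropped from the result even though their visit value also appears elsewhere, because duplication is judged on the variables jointly.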
[Package janitor version 2.2.0 Index]