distinct {poorman} | R Documentation |
Subset distinct/unique rows
Description
Select only distinct/unique rows from a data.frame.
Usage
distinct(.data, ..., .keep_all = FALSE)
Arguments
.data |
A data.frame. |
... |
Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row will be preserved. If omitted, will use all variables. |
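.keep_all |
Logical. If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values. |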
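The first-row rule described for ... can be illustrated with a small base R sketch. It mimics the documented semantics with duplicated() and is not poorman's actual implementation. The data frame and the names all_cols, x_only and x_keep_all are hypothetical, and row names in the result may differ from what distinct() returns.

```r
# Deterministic toy data (hypothetical, not taken from the Examples section)
df <- data.frame(
  x = c(1, 1, 2, 2, 1),
  y = c("a", "b", "c", "c", "a")
)

# No variables supplied: uniqueness is judged on all columns,
# so rows that repeat an earlier row are dropped
all_cols <- df[!duplicated(df), , drop = FALSE]

# Uniqueness judged on x only: keep the first row seen for each value of x
keep <- !duplicated(df[, "x", drop = FALSE])
x_only     <- df[keep, "x", drop = FALSE]  # analogous to .keep_all = FALSE
x_keep_all <- df[keep, , drop = FALSE]     # analogous to .keep_all = TRUE
```

Here x_only holds only the column used for uniqueness, while x_keep_all carries along the y value from the first row of each x.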
Value
A data.frame with the following properties:
- Rows are a subset of the input but appear in the same order.
- Columns are not modified if ... is empty or .keep_all is TRUE. Otherwise, distinct() first calls mutate() to create new columns.
- Groups are not modified.
- data.frame attributes are preserved.
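The two-step behaviour for computed variables (create the column first, then deduplicate) can be sketched in base R. This is an approximation of the described behaviour under the assumption that transform() stands in for mutate(); the data and the names with_diff, first_seen and result are hypothetical.

```r
# Deterministic toy data (hypothetical)
df <- data.frame(x = c(1, 4, 2, 5), y = c(3, 2, 4, 5))

# Step 1: create the computed column (the role mutate() plays)
with_diff <- transform(df, diff = abs(x - y))

# Step 2: keep the first row for each distinct value of the new column,
# preserving the original row order
first_seen <- !duplicated(with_diff$diff)
result <- with_diff[first_seen, "diff", drop = FALSE]
```

The surviving rows appear in input order, which matches the first property listed under Value.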
Examples
library(poorman)  # provides distinct(), group_by() and %>%

df <- data.frame(
  x = sample(10, 100, replace = TRUE),
  y = sample(10, 100, replace = TRUE)
)
nrow(df)
nrow(distinct(df))
nrow(distinct(df, x, y))
distinct(df, x)
distinct(df, y)
# You can choose to keep all other variables as well
distinct(df, x, .keep_all = TRUE)
distinct(df, y, .keep_all = TRUE)
# You can also use distinct on computed variables
distinct(df, diff = abs(x - y))
# The same behaviour applies for grouped data frames,
# except that the grouping variables are always included
df <- data.frame(
  g = c(1, 1, 2, 2),
  x = c(1, 1, 2, 1)
) %>% group_by(g)
df %>% distinct(x)
[Package poorman version 0.2.7 Index]