R: Rows with distinct combinations of columns

distinct-table.express {table.express}

R Documentation

Rows with distinct combinations of columns

Description

Rows with distinct combinations of columns

Usage

## S3 method for class 'ExprBuilder'
distinct(
  .data,
  ...,
  .keep = TRUE,
  .n = 1L,
  .parse = getOption("table.express.parse", FALSE)
)

## S3 method for class 'data.table'
distinct(.data, ...)

Arguments

`.data`	An instance of ExprBuilder, or a data.table for the data.table method.
`...`	Which columns to use to determine uniqueness.
`.keep`	See details below.
`.n`	Indices of rows to return for each unique combination of the chosen columns. See details.
`.parse`	Logical. Whether to apply `rlang::parse_expr()` to obtain the expressions.

Details

If .keep = TRUE (the default), the columns not mentioned in ... are also kept. If one of the expressions in ... creates a new column, .keep can instead be set to a character vector naming all the columns that should appear in the result in addition to the ones mentioned in .... See the examples.
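As a rough illustration of the two modes of .keep described above, the sketch below calls distinct() directly on a data.table, first with the default and then with a character vector. It assumes the table.express and data.table packages are installed. The names DT, default_keep and all_cols are only illustrative, and no output is shown because the columns returned by the default may depend on the package version.

```r
library(table.express)

# Illustrative data: mtcars as a data.table
DT <- data.table::as.data.table(mtcars)

# Default (.keep = TRUE): first row for each distinct value of am + vs
default_keep <- distinct(DT, amvs = am + vs)

# Character vector: explicitly name the columns to return besides amvs
all_cols <- distinct(DT, amvs = am + vs, .keep = names(mtcars))

# Columns that only the explicit version returns (possibly none)
setdiff(names(all_cols), names(default_keep))
```

Inspecting the result of setdiff() is one way to see whether the character vector is needed for a given expression.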

The value of .n is only relevant when .keep is not FALSE. It is used to subset .SD in the built data.table expression. For example, we could get 2 rows per combination by setting .n to 1:2, or get the last row instead of the first by using .N. If more than one index is used and a combination has fewer rows than requested, the missing rows are filled with NA. Note that, at least as of data.table version 1.12.2, only expressions with a single index are internally optimized.
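The .SD subsetting mentioned above can be sketched in plain data.table. The expressions below are assumed to be close in spirit to what distinct() builds, not the exact code the package generates. DT and the result names are illustrative.

```r
library(data.table)

DT <- as.data.table(mtcars)

# One row per combination (comparable to .n = 1L): first row of each group
first_rows <- DT[, .SD[1L], by = .(am, vs)]

# Last row of each group instead (comparable to .n = .N)
last_rows <- DT[, .SD[.N], by = .(am, vs)]

# Two rows per group (comparable to .n = 1:2)
two_rows <- DT[, .SD[1:2], by = .(am, vs)]

# Groups with fewer rows than requested are padded with NA:
# carb values 6 and 8 each occur only once in mtcars
padded <- DT[, .SD[1:2], by = carb]
padded[carb %in% c(6, 8)]
```

The last call makes the NA padding visible: each single-row group returns its one real row followed by a row whose non-grouping columns are NA.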

For more examples, see the vignette or the table.express-package entry.

Examples


data("mtcars")

# keep every original column alongside the new amvs column;
# compare with the result of the default .keep = TRUE
data.table::as.data.table(mtcars) %>%
    distinct(amvs = am + vs, .keep = names(mtcars))

[Package table.express version 0.4.2 Index]