R: Extract Duplicated or Unique Rows

df.duplicated {misty}

R Documentation

Extract Duplicated or Unique Rows

Description

The function df.duplicated extracts duplicated rows and the function df.unique extracts unique rows from a matrix or data frame.

Usage

df.duplicated(..., data, first = TRUE, keep.all = TRUE, from.last = FALSE,
              keep.row.names = TRUE, check = TRUE)

df.unique(..., data, keep.all = TRUE, from.last = FALSE,
          keep.row.names = TRUE, check = TRUE)

Arguments

`...`	an expression indicating the variable names in `data` used to determine duplicated or unique rows.e.g., `df.duplicated(x1, x2, data = dat)`. Note that the operators `.`, `+`, `-`, `~`, `:`, `::`, and `!` can also be used to select variables, see Details in the `df.subset` function.
`data`	a data frame.
`first`	logical: if `TRUE` (default), the `df.duplicated()` function will return duplicated rows including the first of identical rows.
`keep.all`	logical: if `TRUE` (default), the function will return all variables in `x` after extracting duplicated or unique rows based on the variables specified in the argument `...`.
`from.last`	logical: if `TRUE`, duplication will be considered from the reversed side, i.e., the last of identical rows would correspond to `duplicated = FALSE`. Note that this argument is only used when `first = FALSE`.
`keep.row.names`	logical: if `TRUE` (default), the row names from `x` are kept, otherwise they are set to `NULL`.
`check`	logical: if `TRUE` (default), argument specification is checked.

Details

Note that df.unique(x) is equivalent to unique(x). That is, the main difference between the df.unique() and the unique() function is that the df.unique() function provides the ... argument to specify a variable or multiple variables which are used to determine unique rows.

Value

Returns duplicated or unique rows of the data frame in ... or data.

Author(s)

Takuya Yanagida takuya.yanagida@univie.ac.at

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Examples

dat <- data.frame(x1 = c(1, 1, 2, 1, 4),
                  x2 = c(1, 1, 2, 1, 6),
                  x3 = c(2, 2, 3, 2, 6),
                  x4 = c(1, 1, 2, 2, 4),
                  x5 = c(1, 1, 4, 4, 3))

#-------------------------------------------------------------------------------
# df.duplicated() function

# Example 1: Extract duplicated rows based on all variables
df.duplicated(., data = dat)

# Example 2: Extract duplicated rows based on x4
df.duplicated(x4, data = dat)

# Example 3: Extract duplicated rows based on x2 and x3
df.duplicated(x2, x3, data = dat)

# Example 4: Extract duplicated rows based on all variables
# exclude first of identical rows
df.duplicated(., data = dat, first = FALSE)

# Example 5: Extract duplicated rows based on x2 and x3
# do not return all variables
df.duplicated(x2, x3, data = dat, keep.all = FALSE)

# Example 6: Extract duplicated rows based on x4
# consider duplication from the reversed side
df.duplicated(x4, data = dat, first = FALSE, from.last = TRUE)

# Example 7: Extract duplicated rows based on x2 and x3
# set row names to NULL
df.duplicated(x2, x3, data = dat, keep.row.names = FALSE)

#-------------------------------------------------------------------------------
# df.unique() function

# Example 8: Extract unique rows based on all variables
df.unique(., data = dat)

# Example 9: Extract unique rows based on x4
df.unique(x4, data = dat)

# Example 10: Extract unique rows based on x1, x2, and x3
df.unique(x1, x2, x3, data = dat)

# Example 11: Extract unique rows based on x2 and x3
# do not return all variables
df.unique(x2, x3, data = dat, keep.all = FALSE)

# Example 12: Extract unique rows based on x4
# consider duplication from the reversed side
df.unique(x4, data = dat, from.last = TRUE)

# Example 13: Extract unique rows based on x2 and x3
# set row names to NULL
df.unique(x2, x3, data = dat, keep.row.names = FALSE)

[Package misty version 0.6.5 Index]