R: Obtain an index of all instances of values with duplicates...

dup.pairs {NCmisc}

R Documentation

Obtain an index of all instances of values with duplicates (ordered)

Description

The standard 'duplicated' function, called as which(duplicated(x)), returns only the indexes of the repeat occurrences, not of the first instances. For example, for the sequence A,B,A,C,D,B,E it returns 3,6. This function also returns the first instances, so for the same sequence it gives 1,3,2,6 (note that the result is ordered so that the indexes of each duplicated value are adjacent). This index is helpful for diagnosis when duplicates are unexpected, for instance in a data.frame where you wish to compare the rows in which the duplicate values occur. Because the duplicate values are grouped together in the listing, undesired duplicates are easier to troubleshoot manually.
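The behaviour described above can be approximated in base R. The following is a minimal sketch, not the NCmisc source: the helper name `dup_pairs_sketch` is hypothetical, and it assumes that the sets are ordered by sorted value, which this page does not state explicitly.

```r
# Hypothetical helper, NOT the NCmisc implementation:
# a base-R sketch of the behaviour described above.
dup_pairs_sketch <- function(x) {
  # positions of every element whose value occurs more than once
  idx <- which(x %in% x[duplicated(x)])
  # group equal values together (assumption: sets sorted by value),
  # breaking ties by position in x
  idx[order(x[idx], idx)]
}

x <- c("A", "B", "A", "C", "D", "B", "E")
which(duplicated(x))   # 3 6: repeat occurrences only
dup_pairs_sketch(x)    # 1 3 2 6: first instances included, grouped by value
```

The explicit tie-break on `idx` keeps the indexes within each set in their order of appearance in x.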

Usage

dup.pairs(x)

Arguments

`x`	a vector to search for duplicated values

Value

A vector of indices into 'x' for every element whose value occurs more than once (including the first occurrence in each pair, or set of more than two), ordered by set, then by order of appearance in x.

Examples

set <- c(1,1,2,2,3,4,5,6,2,2,2,2,12,1,3,3,1)
dup.pairs(set) # shows the indexes (ordered) of duplicated values
set[dup.pairs(set)] # shows the values that were duplicated (only 1's, 2's and 3's)
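The data.frame use case mentioned in the Description might look like the sketch below. It uses only base R so that it runs without NCmisc installed; the data frame `df` and its columns are invented for illustration, and the value-sorted grouping is an assumption. With the package loaded, `df[dup.pairs(df$id), ]` would presumably serve the same purpose.

```r
# Illustrative data (hypothetical): 'id' was expected to be unique
df <- data.frame(id    = c("s1", "s2", "s1", "s3", "s2"),
                 score = c(10, 20, 11, 30, 20),
                 stringsAsFactors = FALSE)

key <- df$id
# positions of all rows whose id occurs more than once
idx <- which(key %in% key[duplicated(key)])
# put rows sharing an id next to each other (assumed: sorted by id)
idx <- idx[order(key[idx], idx)]

df[idx, ]  # rows 1, 3, 2, 5: the s1 rows, then the s2 rows
```

Viewing the conflicting rows side by side makes it easy to see that the two s1 rows differ in score while the two s2 rows are exact copies.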

[Package NCmisc version 1.2.0 Index]