dup.pairs {NCmisc} | R Documentation
Obtain an index of all instances of values with duplicates (ordered)
Description
The standard 'duplicated' function, called as which(duplicated(x)), returns only the indexes of the repeated values, not of their first instances. For instance, in the sequence A,B,A,C,D,B,E it would return 3,6. This function also returns the first instances, so for this example it gives 1,3,2,6 [note that the result is ordered so that instances of the same value are adjacent]. This index can be helpful for diagnosis when duplicates are unexpected, for instance in a data.frame where you wish to compare the rows in which the duplicated values occur. Because duplicate values are grouped together in the listing, it also helps with manual troubleshooting of undesired duplicates.
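The difference can be seen with base R alone. The following is a minimal sketch using the sequence from the description; it is not the implementation of dup.pairs, and the variable names are illustrative only.

```r
# Example sequence from the description
x <- c("A", "B", "A", "C", "D", "B", "E")

# Only the repeated occurrences, not the first instances
extras <- which(duplicated(x))
print(extras)  # 3 6

# All instances of duplicated values, in order of position
# (not grouped by value, unlike the result described for dup.pairs)
all_instances <- which(duplicated(x) | duplicated(x, fromLast = TRUE))
print(all_instances)  # 1 2 3 6
```

Note that this base R result is sorted by position, so instances of the same value are not adjacent.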
Usage
dup.pairs(x)
Arguments
x | a vector that you wish to extract duplicates from
Value
A vector of the indices of all values in 'x' that are duplicated (including the first observed value in pairs, or in sets of >2), ordered by set, then by order of appearance in x.
Examples
set <- c(1,1,2,2,3,4,5,6,2,2,2,2,12,1,3,3,1)
dup.pairs(set) # shows the indexes (ordered) of duplicated values
set[dup.pairs(set)] # shows the values that were duplicated (only 1's, 2's and 3's)
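For readers without NCmisc installed, the ordering described under Value can be approximated in base R. This is a sketch under stated assumptions, not the package's code: dup_pairs_sketch is a hypothetical helper, the data.frame is made-up data, and ordering the sets by first appearance (rather than, say, by sorted value) is an assumption. Both orderings agree on the examples in this page.

```r
# Hypothetical helper, not part of NCmisc: a base R approximation of the
# ordering described under Value
dup_pairs_sketch <- function(x) {
  # Positions of every value that occurs more than once, first instances included
  idx <- which(duplicated(x) | duplicated(x, fromLast = TRUE))
  # Group by set (assumption: sets ordered by first appearance in x);
  # order() leaves ties in their original order, so positions stay ascending within a set
  idx[order(match(x[idx], x))]
}

set <- c(1, 1, 2, 2, 3, 4, 5, 6, 2, 2, 2, 2, 12, 1, 3, 3, 1)
sketch_idx <- dup_pairs_sketch(set)
print(sketch_idx)  # 1 2 14 17 3 4 9 10 11 12 5 15 16

# Diagnosing unexpected duplicates in a data.frame (made-up data)
df <- data.frame(id = c("A", "B", "A", "C", "D", "B", "E"),
                 score = c(10, 20, 11, 30, 40, 20, 50))
dup_rows <- df[dup_pairs_sketch(df$id), ]
print(dup_rows)  # rows 1, 3, 2, 6: the two "A" rows, then the two "B" rows
```

Listing the rows this way puts each set of duplicates side by side, so differences in the other columns (here, score 10 versus 11 for "A") are easy to spot.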