R: Duplicated rows in datasets

checkdupl {rchemo}

R Documentation

Duplicated rows in datasets

Description

Finds duplicated row observations within a dataset or between two datasets; the result can then be used to remove them.

Usage


checkdupl(X, Y = NULL, digits = NULL)

Arguments

`X`	A dataset.
`Y`	A dataset compared to `X`. Defaults to `NULL`, in which case the rows of `X` are compared with each other.
`digits`	The number of digits used to round the data before the duplication test. Defaults to `NULL` (no rounding).
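The effect of rounding before a duplication test can be illustrated with base R alone. The sketch below does not call `checkdupl` and is not its implementation; it only assumes that `digits` behaves like the `digits` argument of base `round()`. The matrix `X` and the names `dup_exact` and `dup_rounded` are made up for the illustration.

```r
# Illustrative data: rows 1 and 2 differ only in the fourth decimal place
X <- rbind(c(1.2341, 5.0),
           c(1.2342, 5.0),
           c(7.5,    2.0))

# Exact comparison: no row is flagged as a duplicate
dup_exact <- duplicated(X)

# Rounding to 3 digits first: row 2 becomes identical to row 1
dup_rounded <- duplicated(round(X, digits = 3))

dup_exact
dup_rounded
```

With rounding, near-identical rows (for example, values that differ only by measurement or storage precision) are treated as duplicates; without it, only exactly equal rows are.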

Value

A data frame giving the row numbers of the identical rows in the first and second datasets, along with the values of the variables.
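To give an idea of the shape of such a result, a comparable table can be built with base R `merge()`. This is only a sketch under stated assumptions, not the code used by `checkdupl`: the column name `rownum2` appears in the examples of this page, while `rownum1` is assumed by analogy, and the helper objects `d1`, `d2` and `pairs` are hypothetical.

```r
# Two small datasets sharing the same variables
X1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
X2 <- matrix(c(6:10, 1:5, c(1, 2, 7, 6, 12)), nrow = 3, byrow = TRUE)
colnames(X1) <- colnames(X2) <- paste0("v", 1:5)

# Attach the original row numbers to each dataset
d1 <- data.frame(rownum1 = seq_len(nrow(X1)), X1)
d2 <- data.frame(rownum2 = seq_len(nrow(X2)), X2)

# Inner join on all variables: one output row per pair of identical rows
pairs <- merge(d1, d2, by = colnames(X1))

# Put the row numbers first, followed by the variable values
pairs <- pairs[, c("rownum1", "rownum2", colnames(X1))]
pairs
```

Here rows 1 and 2 of `X1` both match row 2 of `X2`, so the table contains two pairs.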

Examples


X1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
dimnames(X1) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

X2 <- matrix(c(6:10, 1:5, c(1, 2, 7, 6, 12)), nrow = 3, byrow = TRUE)
dimnames(X2) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

X1
X2

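# Rows that are identical between X1 and X2
checkdupl(X1, X2)

# Rows that are duplicated within X1
checkdupl(X1)

# Random continuous data: duplicated rows are not expected
checkdupl(matrix(rnorm(20), nrow = 5))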

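# Remove from X1 the rows listed in rownum2
res <- checkdupl(X1)
s <- unique(res$rownum2)
# Guard: X1[-integer(0), ] would drop every row if no duplicate were found
zX1 <- if (length(s) > 0) X1[-s, , drop = FALSE] else X1
zX1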

[Package rchemo version 0.1-2 Index]