best_duplicate {rempsyc} | R Documentation |
Choose the best duplicate
Description
Chooses the best duplicate: among rows that share the same ID, it keeps the one with the fewest missing values. In case of ties, it keeps the first duplicate, as it is the one most likely to be valid and authentic, given practice effects.
Usage
best_duplicate(data, id, keep.rows = FALSE)
Arguments
data | The data frame.
id | The ID variable for which to check for duplicates.
keep.rows | Logical, whether to add a column at the beginning of the data frame with the original row indices.
Details
For the easystats equivalent, see datawizard::data_duplicated().
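The selection rule given in the Description (fewest missing values, with ties going to the first row) can be sketched in base R. This is only an illustration of the rule, not the package's implementation: the helper name pick_best_sketch is hypothetical, and the sketch assumes that one row is kept per ID, including IDs that occur only once.

```r
# Hypothetical helper illustrating the rule; not part of rempsyc.
pick_best_sketch <- function(data, id) {
  # Number of missing values in each row
  n_missing <- rowSums(is.na(data))
  rows <- seq_len(nrow(data))
  # Within each ID, pick the row index with the fewest NAs.
  # which.min() returns the first minimum, so ties go to the earliest row.
  best <- tapply(rows, data[[id]], function(i) i[which.min(n_missing[i])])
  # Assumption: keep one row per ID, in original row order
  data[sort(as.vector(best)), , drop = FALSE]
}

df1 <- data.frame(
  id = c(1, 2, 3, 1, 3),
  item1 = c(NA, 1, 1, 2, 3),
  item2 = c(NA, 1, 1, 2, 3),
  item3 = c(NA, 1, 1, 2, 3)
)
result <- pick_best_sketch(df1, "id")
rownames(result)  # "2" "3" "4"
```

For ID 1, row 4 is chosen over row 1 because row 1 has three missing values. For ID 3, rows 3 and 5 are tied with no missing values, so the earlier row 3 is chosen.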
Value
A data frame containing only the "best" duplicates.
Examples
df1 <- data.frame(
id = c(1, 2, 3, 1, 3),
item1 = c(NA, 1, 1, 2, 3),
item2 = c(NA, 1, 1, 2, 3),
item3 = c(NA, 1, 1, 2, 3)
)
best_duplicate(df1, id = "id", keep.rows = TRUE)
[Package rempsyc version 0.1.8 Index]