duplicate_count_colpair {scrutiny}    R Documentation

Count duplicate values by column

Description

duplicate_count_colpair() takes a data frame and checks each combination of columns for duplicates. Results are presented in a tibble, ordered by the number of duplicates.
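
As a rough illustration of what checking each combination of columns can look like, here is a minimal base R sketch. It is not the package's implementation: `count_shared_values()` is a hypothetical helper, and it assumes that a "duplicate" means a distinct, non-missing value that occurs in both columns of a pair. The counting rule and output columns of duplicate_count_colpair() may differ.

```r
# Hypothetical helper, not part of scrutiny. Assumption: a "duplicate" is a
# distinct, non-missing value that occurs in both columns of a pair.
count_shared_values <- function(data) {
  # All unordered pairs of column names
  pairs <- utils::combn(names(data), 2, simplify = FALSE)
  rows <- lapply(pairs, function(p) {
    x <- unique(data[[p[1]]][!is.na(data[[p[1]]])])
    y <- unique(data[[p[2]]][!is.na(data[[p[2]]])])
    data.frame(x = p[1], y = p[2], count = length(intersect(x, y)))
  })
  out <- do.call(rbind, rows)
  # Order by the number of shared values, largest first
  out[order(out$count, decreasing = TRUE), , drop = FALSE]
}

toy <- data.frame(
  a = c(1, 2, 3, NA),
  b = c(3, 4, 5, 1),
  c = c(9, 9, 2, 7)
)
result <- count_shared_values(toy)
result
```

In this toy data, columns a and b share the values 1 and 3, a and c share only 2, and b and c share nothing, so the pair a/b is listed first.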

Usage

duplicate_count_colpair(
  data,
  ignore = NULL,
  show_rates = TRUE,
  na.rm = deprecated()
)

Arguments

data

Data frame.

ignore

Optionally, a vector of values that should not be checked for duplicates.

show_rates

Logical. If TRUE (the default), adds columns rate_x and rate_y. See the Value section. Set show_rates to FALSE for higher performance.

na.rm

[Deprecated] Missing values are never counted.

Value

A tibble (data frame) with these columns –

Summaries with audit()

There is an S3 method for audit(), so you can call audit() following duplicate_count_colpair(). It returns a tibble with summary statistics.

See Also

Examples

# Basic usage:
mtcars %>%
  duplicate_count_colpair()

# Summaries with `audit()`:
mtcars %>%
  duplicate_count_colpair() %>%
  audit()
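
The ignore and show_rates arguments described above can be tried in the same way. The following is a sketch that assumes the scrutiny package is installed; the choice of 0 and 1 as values to ignore is purely illustrative.

```r
# Assumes the scrutiny package is installed; skipped otherwise.
if (requireNamespace("scrutiny", quietly = TRUE)) {
  # Leave the values 0 and 1 out of the duplicate check
  # (illustrative choice of values):
  without_binary <- scrutiny::duplicate_count_colpair(mtcars, ignore = c(0, 1))

  # Omit the `rate_x` and `rate_y` columns for higher performance:
  no_rates <- scrutiny::duplicate_count_colpair(mtcars, show_rates = FALSE)
}
```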

[Package scrutiny version 0.4.0 Index]