comparedf.control {arsenal}R Documentation

Control settings for comparedf function

Description

Control tolerance definitions for the comparedf function.

Usage

comparedf.control(
  tol.logical = "none",
  tol.num = c("absolute", "percent", "pct"),
  tol.num.val = sqrt(.Machine$double.eps),
  int.as.num = FALSE,
  tol.char = c("none", "trim", "case", "both"),
  tol.factor = c("none", "levels", "labels"),
  factor.as.char = FALSE,
  tol.date = "absolute",
  tol.date.val = 0,
  tol.other = "none",
  tol.vars = "none",
  max.print.vars = NA,
  max.print.obs = NA,
  max.print.diffs.per.var = 10,
  max.print.diffs = 50,
  max.print.attrs = NA,
  ...,
  max.print.diff = 10
)

Arguments

tol.logical, tol.num, tol.char, tol.factor, tol.date, tol.other

A function or one of the shortcut character strings or a list thereof, denoting the tolerance function to use for a given data type. See "details", below.

tol.num.val

Numeric; maximum value of differences allowed in numerics (fed to the function given in tol.num).

int.as.num

Logical; should integers be coerced to numeric before comparison? Default FALSE.

factor.as.char

Logical; should factors be coerced to character before comparison? Default FALSE.

tol.date.val

Numeric; maximum value of differences allowed in dates (fed to the function given in tol.date).

tol.vars

Either "none" (the default), denoting that variable names are to be matched as-is, a named vector manually specifying variable names to compare (where the names correspond to columns of x and the values correspond to columns of y), or a character vector denoting equivalence classes for characters in the variable names. See "details", below.

max.print.vars

Integer denoting maximum number of variables to report in the "variables not shared" and "variables not compared" output. NA will print all differences.

max.print.obs

Integer denoting maximum number of not-shared observations to report. NA will print all differences.

max.print.diffs.per.var, max.print.diffs

Integers denoting the maximum number of differences to report for each variable or overall. NA will print all differences for each variable or overall.

max.print.attrs

Integers denoting the maximum number of non-identical attributes to report.NA will print all differences.

...

Other arguments (not in use at this time).

max.print.diff

Deprecated.

Details

The following character strings are accepted:

A list with names mapped to x can be used to specify tolerances by variable. One unnamed element is supported as the default.

tol.vars: If not set to "none" (the default) or a named vector, the tol.vars argument is a character vector denoting equivalence classes for the characters in the variable names. A single character in this vector means to replace that character with "". All other strings in this vector are split by character and replaced by the first character in the string.

E.g., a character vector c("._", "aA", " ") would denote that the dot and underscore are equivalent (to be translated to a dot), that "a" and "A" are equivalent (to be translated to "a"), and that spaces should be removed.

The special character string "case" in this vector is the same as specifying paste0(letters, LETTERS).

Value

A list containing the necessary parameters for the comparedf function.

Author(s)

Ethan Heinzen

See Also

comparedf, comparedf.tolerances, summary.comparedf

Examples

cntl <- comparedf.control(
  tol.num = "pct",     # calculate percent differences
  tol.vars = c("case", # ignore case
               "._",   # set all underscores to dots.
               "e")    # remove all letter e's
)

cntl <- comparedf.control(tol.char = list(
  "none",      # the default
  x1 = "case", # be case-insensitive for the variable "x1"
  x2 = function(x, y) tol.NA(x, y, x != y | y == "NA") # a custom-defined tolerance
))

[Package arsenal version 3.6.3 Index]