redcap_column_sanitize {REDCapR}	R Documentation

Sanitize to adhere to REDCap character encoding requirements

Description

Replace non-ASCII characters with legal characters that won't cause problems when writing to a REDCap project.

Usage

redcap_column_sanitize(
  d,
  column_names = colnames(d),
  encoding_initial = "latin1",
  substitution_character = "?"
)

Arguments

d

The base::data.frame() containing the dataset used to update the REDCap project. Required.

column_names

A character vector naming the variables (columns) to sanitize. Defaults to all columns of d. Optional.

encoding_initial

A character value naming the encoding that the values are assumed to be in before conversion. Defaults to "latin1". Optional.

substitution_character

The character value substituted for any character that cannot be transliterated to ASCII. Defaults to "?". Optional.

Details

Letters like an accented 'A' are replaced with a plain 'A'.

This is a thin wrapper around base::iconv(). The ASCII//TRANSLIT option does the actual transliteration work. As of R 3.1.0, operating systems use similar but not identical iconv implementations to convert the characters, so the same input can produce different output on different platforms. Keep this in mind if you notice OS-dependent differences.
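
The transliteration step described above can be sketched with base R alone. This is a minimal illustration, not the package's source: sanitize_sketch is a hypothetical helper, and it assumes the wrapper passes encoding_initial as the from argument and substitution_character as the sub argument of base::iconv(). Skipping non-character columns is a choice made for this sketch only. No expected output is shown, because the replacement characters depend on the platform's iconv implementation.

```r
# Hypothetical helper illustrating the idea; NOT the REDCapR implementation.
sanitize_sketch <- function(d,
                            column_names           = colnames(d),
                            encoding_initial       = "latin1",
                            substitution_character = "?") {
  for (column in column_names) {
    # Assumption of this sketch: only character columns are converted,
    # so numeric columns keep their original type.
    if (is.character(d[[column]])) {
      d[[column]] <- base::iconv(
        x    = d[[column]],
        from = encoding_initial,        # encoding the bytes are assumed to be in
        to   = "ASCII//TRANSLIT",       # ask iconv to transliterate to ASCII
        sub  = substitution_character   # used when a character cannot be converted
      )
    }
  }
  d
}

dirty <- data.frame(
  id    = 1:3,
  names = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher"),
  stringsAsFactors = FALSE
)

# Sanitize only the 'names' column; 'id' is returned unchanged.
clean <- sanitize_sketch(dirty, column_names = "names")
clean$names
```

Whatever replacement the local iconv chooses for each accented letter, the resulting strings should contain only ASCII bytes, which is the property that matters when writing to REDCap.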

Value

A base::data.frame() with the same columns, but whose character values have been sanitized.

Author(s)

Will Beasley

Examples

# Typical examples are not shown because they require non-ASCII encoding,
#   which makes the package documentation less portable.

dirty <- data.frame(
  id     = 1:3,
  names  = c("Ekstr\xf8m", "J\xf6reskog", "bi\xdfchen Z\xfcrcher")
)

REDCapR::redcap_column_sanitize(dirty)

[Package REDCapR version 1.1.0 Index]