R: Redact columns from a dataframe with the default redactors

redact_columns {dittodb}

R Documentation

Redact columns from a dataframe with the default redactors

Description

Redacts the columns named in `columns` from the dataframe `data`, using dittodb's standard redactors.

Usage

redact_columns(data, columns, ignore.case = TRUE, ...)

Arguments

`data`	a dataframe to redact
`columns`	character, the columns to redact
`ignore.case`	should case be ignored? (default: `TRUE`)
`...`	additional options to pass on to `grep()` when matching the column names

Details

The column names given in the columns argument are treated as regular expressions; however, ^ and $ are always added to the beginning and end of each string. So to match any column that starts with the string sensitive (e.g. sensitive_name, sensitive_date), you could use "sensitive.*", which would catch all of those columns (though it would not catch a column called most_sensitive_name).
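The anchoring behaviour described above can be illustrated with base R's `grep()` alone. This is a minimal sketch, not dittodb's actual implementation: the column names are hypothetical, and wrapping the pattern with `paste0()` is an assumption about how the `^` and `$` anchors are applied.

```r
# Hypothetical column names, used only for illustration
col_names <- c("sensitive_name", "sensitive_date",
               "most_sensitive_name", "Sensitive_id")

# Assumed equivalent of the anchoring described above:
# the user-supplied pattern is wrapped in ^ and $
anchor <- function(pattern) paste0("^", pattern, "$")

# "sensitive.*" matches names that start with "sensitive";
# ignore.case = TRUE mirrors the function's default
matched <- grep(anchor("sensitive.*"), col_names,
                ignore.case = TRUE, value = TRUE)
print(matched)

# Without the trailing ".*", the anchors require an exact match,
# so none of these column names are selected
exact <- grep(anchor("sensitive"), col_names,
              ignore.case = TRUE, value = TRUE)
print(exact)
```

Because of the leading `^`, `most_sensitive_name` is not matched by `"sensitive.*"`; a pattern such as `".*sensitive.*"` would be needed to include it.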

The standard redactors replace all values in the column with the following values based on the column's type:

integer – 9L
numeric – 9
character – "[redacted]"
POSIXct (date times) – as.POSIXct("1988-10-11T17:00:00", tz = tzone)
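The type-based replacement listed above could be sketched in base R roughly as follows. The helper `redact_sketch()` is hypothetical and is not part of dittodb; the `tzone` parameter and its `"UTC"` default are assumptions, as is leaving columns of other types unchanged.

```r
# Hypothetical helper (not a dittodb function): returns a redacted
# copy of a single column, choosing the replacement by column type
redact_sketch <- function(x, tzone = "UTC") {
  n <- length(x)
  if (inherits(x, "POSIXct")) {
    # Date-times: a fixed timestamp in the assumed time zone
    as.POSIXct(rep("1988-10-11 17:00:00", n),
               format = "%Y-%m-%d %H:%M:%S", tz = tzone)
  } else if (is.integer(x)) {
    rep(9L, n)
  } else if (is.numeric(x)) {
    rep(9, n)
  } else if (is.character(x)) {
    rep("[redacted]", n)
  } else {
    x  # assumption: other types are left untouched
  }
}

# Small illustrative data frame with one column of each type
df <- data.frame(
  id = 1:3,
  delay = c(1.5, -2, 10),
  origin = c("JFK", "LGA", "EWR"),
  stringsAsFactors = FALSE
)
df$when <- as.POSIXct(
  c("2013-01-01 05:00:00", "2013-01-01 06:00:00", "2013-01-01 07:00:00"),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)

# Redact every column; `[] <-` keeps the data frame structure
redacted <- df
redacted[] <- lapply(df, redact_sketch)
print(redacted)
```

The POSIXct check comes first so that date-time columns are never handled by the numeric branch, and each replacement has the same length as the original column so the data frame keeps its shape.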

Value

data, with the columns specified in columns duly redacted

Examples

if (check_for_pkg("nycflights13", message)) {
  small_flights <- head(nycflights13::flights)

  # with no columns specified, redacting does nothing
  redact_columns(small_flights, columns = NULL)

  # integer
  redact_columns(small_flights, columns = c("arr_time"))

  # numeric
  redact_columns(small_flights, columns = c("arr_delay"))

  # characters
  redact_columns(small_flights, columns = c("origin", "dest"))

  # datetimes
  redact_columns(small_flights, columns = c("time_hour"))
}

[Package dittodb version 0.1.8 Index]