sub_holder {textclean}    R Documentation

Hold the Place of Characters Prior to Subbing

Description

This function holds the place for particular character values, allowing the user to manipulate the vector and then revert the place holders back to the original values.

Usage

sub_holder(x, pattern, alpha.type = TRUE, holder.prefix = "zzzplaceholder",
  holder.suffix = "zzz", ...)

Arguments

x

A character vector.

pattern

Character string(s) to be matched in the given character vector. A vector of several strings may be supplied, as in the examples.

alpha.type

Logical. If TRUE, lower case letters are used for the key. If FALSE, numbers are used for the key.

holder.prefix

The prefix to use before the alpha key in the place holder when alpha.type = TRUE; this ensures uniqueness.

holder.suffix

The suffix to use after the alpha key in the place holder when alpha.type = TRUE; this ensures uniqueness.

...

Additional arguments passed to gsub.

Value

Returns a list with the following:

output

A character vector in which each match of pattern has been replaced by its keyed place holder

unhold

A function that reverts the place holders back to the original values

Note

The unhold function for sub_holder will only work on keys that have not been disturbed by subsequent alterations. The key follows the pattern of holder.prefix ('zzzplaceholder') followed by lower case letter keys followed by holder.suffix ('zzz') when alpha.type = TRUE, otherwise the holder is numeric.
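
The behavior described in this note can be illustrated with a minimal base R sketch. It is not the textclean implementation: hold_sketch is a hypothetical helper written only for this illustration, and it assumes literal (fixed) matching and at most 26 patterns. It shows that letter-only keys survive punctuation removal, while upper-casing the text disturbs the keys so that unhold can no longer find them.

```r
# Hypothetical helper: a base R sketch of the place-holder idea,
# NOT the code used by textclean::sub_holder.
hold_sketch <- function(x, pattern, prefix = "zzzplaceholder", suffix = "zzz") {
    # One letter-only key per pattern (this sketch supports at most 26 patterns)
    stopifnot(length(pattern) <= length(letters))
    keys <- paste0(prefix, letters[seq_along(pattern)], suffix)

    # Swap each pattern for its key, matching literally rather than as a regex
    output <- x
    for (i in seq_along(pattern)) {
        output <- gsub(pattern[i], keys[i], output, fixed = TRUE)
    }

    # The closure remembers keys and patterns so the swap can be reversed
    unhold <- function(y) {
        for (i in seq_along(keys)) {
            y <- gsub(keys[i], pattern[i], y, fixed = TRUE)
        }
        y
    }
    list(output = output, unhold = unhold)
}

x <- c("Great job! :-)", "Oh no... :-(")
m <- hold_sketch(x, c(":-)", ":-("))

# Stripping punctuation leaves the letter-only keys intact,
# so the emoticons are restored: "Great job :-)" "Oh no :-("
intact <- m$unhold(gsub("[[:punct:]]", "", m$output))

# Upper-casing alters the keys themselves, so unhold() finds nothing
# to revert and the upper-cased keys remain in the text
disturbed <- m$unhold(toupper(m$output))
```

Any manipulation applied between holding and unholding therefore has to leave the lower case key text untouched (or, with numeric keys, the digits).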

Examples

## `alpha.type` as TRUE
library(textclean); library(lexicon); library(textshape)
(fake_dat <- paste(hash_emoticons[1:11, 1, with=FALSE][[1]], DATA$state))
(m <- sub_holder(fake_dat, hash_emoticons[[1]]))
m$unhold(strip(m$output))

## `alpha.type` as FALSE (numeric keys)
vowels <- LETTERS[c(1, 5, 9, 15, 21)]
(m2 <- sub_holder(toupper(DATA$state), vowels, alpha.type = FALSE))
m2$unhold(gsub("[^0-9]", "", m2$output))
mtabulate(strsplit(m2$unhold(gsub("[^0-9]", "", m2$output)), ""))

[Package textclean version 0.9.3 Index]