preprocess {sbo}    R Documentation
Preprocess text corpus
Description
A simple text preprocessing utility.
Usage
preprocess(input, erase = "[^.?!:;'\\w\\s]", lower_case = TRUE)
Arguments
input: a character vector.
erase: a length one character vector. Regular expression matching the parts of the text to erase from input. The default erases every character that is not a word character (alphanumeric or underscore, as matched by \w), white space, an apostrophe, or one of the punctuation characters ".?!:;".
lower_case: a length one logical vector. If TRUE, converts the text to lower case.
Value
a character vector containing the processed output.
Author(s)
Valerio Gherardi
Examples
preprocess("Hi @ there! I'm using `sbo`.")
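For intuition, the documented behaviour can be approximated in base R. This is a minimal sketch based only on the argument descriptions above, not the package's actual implementation: the name preprocess_sketch is hypothetical, and the use of Perl-compatible regular expressions (perl = TRUE) is an assumption.

```r
# Hypothetical helper, not part of sbo: approximates the documented
# behaviour of preprocess() using base R only.
preprocess_sketch <- function(input,
                              erase = "[^.?!:;'\\w\\s]",
                              lower_case = TRUE) {
  stopifnot(is.character(input))
  # Delete every match of the `erase` regular expression.
  output <- gsub(erase, "", input, perl = TRUE)
  # Optionally convert the result to lower case.
  if (lower_case) output <- tolower(output)
  output
}

# "@" and the backticks are erased; the spaces around "@" both survive.
preprocess_sketch("Hi @ there! I'm using `sbo`.")
# [1] "hi  there! i'm using sbo."

# A stricter pattern that also drops punctuation, keeping the original case.
preprocess_sketch("Hi @ there!", erase = "[^\\w\\s]", lower_case = FALSE)
# [1] "Hi  there"
```

Erased characters are deleted rather than replaced, so in this sketch any white space around them is left in place (note the double space in both results).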
[Package sbo version 0.5.0 Index]