ngram {JATSdecoder} | R Documentation |
ngram
Description
Extracts the ngram bag of words surrounding words that match a search pattern. Note: If an input string contains the search pattern more than once, only the bag of words around the last hit is returned. Consider splitting the text with text2sentences() or strsplit2() before applying ngram().
Usage
ngram(
  x,
  pattern,
  ngram = c(-3, 3),
  tolower = FALSE,
  split = FALSE,
  exact = FALSE
)
Arguments
x | Vector of text strings to process.
pattern | A search term pattern around which the ngram bag of words is extracted.
ngram | A numeric vector of length 2 that defines the number of words to extract to the left and to the right of the pattern match.
tolower | Logical. If TRUE, converts text and pattern to lower case.
split | Logical. If TRUE, splits the text input at "[.,;:] " before processing. Note: Consider applying other text splits beforehand.
exact | Logical. If TRUE, only exact word matches will be processed.
Value
Character vector containing the ±n words around the search pattern match.
Examples
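library(JATSdecoder)
text <- "One hundred twenty-eight students participated in our Study,
that was administered in thirteen clinics."
# Window: one word to the left and two words to the right of the match.
ngram(text, pattern = "study", ngram = c(-1, 2))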
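
The note in the Description says that only the last hit is kept when an input contains the pattern more than once. The following is a minimal sketch of the suggested workaround, splitting the text with text2sentences() before calling ngram(). The example text and the variable name `sentences` are made up for illustration, and the sketch assumes that text2sentences() returns one vector element per sentence.

```r
library(JATSdecoder)

# Illustrative text (not from the package) with two matches of "study".
text2 <- "The first study had 20 participants. The second study had 35 participants."

# On the unsplit string, only the last hit is processed (see Description).
ngram(text2, pattern = "study", ngram = c(-1, 2))

# Assumption: text2sentences() yields one element per sentence, so each
# element contains the pattern at most once and every hit is processed.
sentences <- text2sentences(text2)
ngram(sentences, pattern = "study", ngram = c(-1, 2))
```

Setting split = TRUE is another option when splitting at "[.,;:] " is sufficient for the input.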