cleanAbstracts {PubMedWordcloud} | R Documentation
clean data
Description
Cleans abstract text before word counting. The function removes punctuation and, depending on the arguments, removes numbers, converts characters to lower or upper case, removes English stopwords, removes user-specified words, and stems words.
Usage
cleanAbstracts(abstracts, rmNum = TRUE, tolw = TRUE, toup = FALSE,
  rmWords = TRUE, yrWords = NULL, stemDoc = FALSE)
Arguments
abstracts | Output of getAbstracts, or a paragraph of text.
rmNum | Logical; whether to remove numbers from the text.
tolw | Logical; whether to convert characters to lower case.
toup | Logical; whether to convert characters to upper case.
rmWords | Logical; whether to remove a set of English stopwords (e.g., 'the').
yrWords | A character vector of additional words to remove.
stemDoc | Logical; whether to stem words using Porter's stemming algorithm.
See Also
Examples
# Abs=getAbstracts(c("22693232", "22564732"))
# cleanAbs=cleanAbstracts(Abs)
# text="Jobs received a number of honors and public recognition."
# cleanD=cleanAbstracts(text)
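
The cleaning steps listed in the Description can be approximated in base R, which may help when deciding how to set the arguments. The sketch below is an illustration only: `clean_text_sketch` is a hypothetical helper, not part of PubMedWordcloud, its stopword list is a tiny stand-in, it returns a plain character vector of words (an assumption, not the documented return value of cleanAbstracts), and it leaves out stemming.

```r
# Hypothetical helper for illustration; it is NOT part of PubMedWordcloud.
# The argument names mirror cleanAbstracts(), but the body is an assumption.
clean_text_sketch <- function(text, rmNum = TRUE, tolw = TRUE, toup = FALSE,
                              rmWords = TRUE, yrWords = NULL) {
  # Tiny stand-in stopword list (assumption); a real list is much longer.
  stop_words <- c("a", "an", "and", "in", "of", "the", "to")

  out <- gsub("[[:punct:]]+", " ", text)            # replace punctuation with spaces
  if (rmNum) out <- gsub("[[:digit:]]+", " ", out)  # replace digits with spaces
  if (tolw) out <- tolower(out)
  if (toup) out <- toupper(out)

  # Split on whitespace and discard empty tokens.
  words <- unlist(strsplit(out, "[[:space:]]+"))
  words <- words[nzchar(words)]

  # Compare in lower case so removal works whatever the case setting.
  if (rmWords) words <- words[!(tolower(words) %in% stop_words)]
  if (!is.null(yrWords)) words <- words[!(tolower(words) %in% tolower(yrWords))]
  words
}

text <- "Jobs received a number of honors and public recognition."
cleaned <- clean_text_sketch(text, yrWords = c("number"))
print(cleaned)
```

Passing `yrWords` is a convenient way to drop terms that are frequent in the abstracts but uninformative for a word cloud.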
[Package PubMedWordcloud version 0.3.6 Index]