removeStopWords {arabicStemR}    R Documentation
Remove Arabic stopwords.
Description
Defines a list of Arabic-language stopwords and removes them from a string.
Usage
removeStopWords(texts, defaultStopwordList=TRUE, customStopwordList=NULL)
Arguments
texts
    A string from which Arabic stopwords should be removed.
defaultStopwordList
    If TRUE, the default stopword list is used. If FALSE, the default stopword list is not used. Default is TRUE.
customStopwordList
    Optional user-specified list of stopwords to be removed, supplied as a vector of strings in either Arabic UTF-8 characters or Latin characters following the stemmer's transliteration scheme (words without Arabic UTF-8 characters are processed with reverse.transliterate()). Default is NULL.
Value
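Returns a list with two elements: text, the input string with Arabic stopwords removed, and arabicStopwordList, the list of stopwords that was used.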
Author(s)
Rich Nielsen
Examples
## Create string with Arabic characters
x <- '\u0627\u0647\u0644\u0627 \u0648\u0633\u0647\u0644\u0627
\u064a\u0627 \u0635\u062f\u064a\u0642\u064a'
## Remove stop words
removeStopWords(x)$text
## Not run:
## To see the full list of stop words
removeStopWords(x)$arabicStopwordList
## End(Not run)
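The arguments described above can also be combined to remove only user-chosen words. The following is a minimal sketch, assuming the arabicStemR package is installed; the variable names x2 and res are illustrative, and the custom list holds a single word chosen only for demonstration.

```r
library(arabicStemR)

## Same sample text as above, written on a single line
x2 <- '\u0627\u0647\u0644\u0627 \u0648\u0633\u0647\u0644\u0627 \u064a\u0627 \u0635\u062f\u064a\u0642\u064a'

## Skip the default list and supply one custom stopword in Arabic UTF-8
## (the word here is an arbitrary choice for illustration)
res <- removeStopWords(x2,
                       defaultStopwordList = FALSE,
                       customStopwordList = c('\u064a\u0627'))

## The cleaned string
res$text

## The stopword list that was applied
res$arabicStopwordList
```

Leaving defaultStopwordList at TRUE while also passing customStopwordList should remove the custom words in addition to the default ones.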
[Package arabicStemR version 1.3 Index]