remove.stopwords {morestopwords}    R Documentation

Removes stop words from a string whose language is known

Description

Removes stop words from a string whose language is known.

Usage

remove.stopwords(str, lang = "auto", fallback = "English")

Arguments

str

A string, or a vector of strings, from which to delete the stop words.

lang

Either:

  • 'auto', in which case cld2 is used to perform language detection; or

  • A string (or a vector of strings, depending on str) giving an ISO 639-2/3 code or a language name from which to derive an ISO 639-2 code (language names are resolved by string matching).

fallback

Fallback language used when cld2 fails to detect the language or when the manually specified string does not match a supported language. Defaults to 'English'.
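
How a lang entry might be resolved, and when fallback comes into play, can be sketched in base R. This is an illustration only, not the package's implementation: the helpers resolve_one() and resolve_lang() and the small table known are hypothetical, and case-insensitive unique-prefix matching is an assumed reading of "string matching".

```r
# Hypothetical lookup table: a toy subset of language names and ISO 639-2 codes.
known <- data.frame(
  name = c("Catalan", "English", "French", "German", "Irish"),
  code = c("cat", "eng", "fra", "deu", "gle"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: resolve one user-supplied value to an ISO 639-2 code.
resolve_one <- function(x, fallback = "English") {
  hit <- NA_integer_
  if (!is.na(x)) {
    key <- tolower(x)
    # Try an exact code match first, then a unique prefix match on the name.
    hit <- match(key, known$code)
    if (is.na(hit)) hit <- pmatch(key, tolower(known$name))
  }
  # Nothing matched (or NA was given): resolve the fallback language instead.
  if (is.na(hit)) hit <- pmatch(tolower(fallback), tolower(known$name))
  known$code[hit]
}

# Vectorised wrapper, one resolved code per element of `lang`.
resolve_lang <- function(lang, fallback = "English") {
  vapply(lang, resolve_one, character(1), fallback = fallback,
         USE.NAMES = FALSE)
}

resolved <- resolve_lang(c(NA, "cata", "Iris", "deu", "klingon"),
                         fallback = "english")
print(resolved)
```

In this sketch, NA and the unrecognised "klingon" both end up on the fallback language, while partial names such as "cata" and "Iris" resolve to their own codes.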

Value

A string (or a vector of strings, depending on str) corresponding to str with the stop words for the language(s) given by lang removed.
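
The shape of the return value can be illustrated with a toy version of the removal step. The stop-word lists below are tiny, made-up samples, and strip_stopwords() is a hypothetical helper; the package's actual lists and tokenisation may differ.

```r
# Toy stop-word lists keyed by ISO 639-2 code (illustrative, far from complete).
stopword_lists <- list(
  eng = c("i", "am", "in", "the"),
  fra = c("je", "suis", "en")
)

# Hypothetical helper: drop the stop words of language `code` from one string.
strip_stopwords <- function(s, code) {
  words <- strsplit(s, "\\s+")[[1]]
  keep <- !(tolower(words) %in% stopword_lists[[code]])
  paste(words[keep], collapse = " ")
}

strs  <- c(French = "Je suis en Allemagne", English = "I am in the garden")
codes <- c("fra", "eng")

# mapply pairs each string with its language code; the names of `strs` carry over.
out <- mapply(strip_stopwords, strs, codes)
print(out)
```

A vector input gives a vector of the same length back, one cleaned string per element, with the input names preserved in this sketch.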

Examples

# Multiple strings in different languages
remove.stopwords(str = c(Gibberish = 'dadas',
                         Catalan = 'Adeu amic meu',
                         Irish = 'Slan a chara',
                         French = 'Je suis en Allemagne',
                         German = 'Ich liebe Deutschland'),
                 # Various ways of indicating the language
                 lang = c(NA, 'cata', 'Iris', 'fr', 'deu'),
                 # Yet another way
                 fallback = 'english'
                 )


[Package morestopwords version 0.2.0 Index]