R: Remove word classes

filterByClass {koRpus}

R Documentation

Remove word classes

Description

This method removes all tokens of the specified word classes from tagged text objects.

Usage

filterByClass(txt, ...)

## S4 method for signature 'kRp.text'
filterByClass(
  txt,
  corp.rm.class = "nonpunct",
  corp.rm.tag = c(),
  as.vector = FALSE,
  update.desc = TRUE
)

Arguments

`txt`	An object of class `kRp.text`.
`...`	Additional options, currently unused.
`corp.rm.class`	A character vector with word classes which should be removed. The default value `"nonpunct"` has special meaning: it causes the result of `kRp.POS.tags(lang, tags=c("punct","sentc"), list.classes=TRUE)` to be used, so that punctuation and sentence-ending marks are removed. Another valid value is `"stopword"`, which removes all detected stopwords.
`corp.rm.tag`	A character vector with valid POS tags which should be removed.
`as.vector`	Logical. If `TRUE`, results will be returned as a character vector containing only the text parts which survived the filtering.
`update.desc`	Logical. If `TRUE`, the `desc` slot of the tagged object will be fully recalculated using the filtered text. If `FALSE`, the `desc` slot will be copied from the original object. Finally, if `NULL`, the `desc` slot remains empty.

Value

An object of the input class. If `as.vector=TRUE`, a character vector is returned instead.

Examples

# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  filterByClass(tokenized.obj)
} else {}
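
The following sketch illustrates the `as.vector` and `update.desc` arguments described above. It reuses the sample file from the example and only relies on arguments documented on this page; the object names `filtered.tokens` and `filtered.obj` are chosen for illustration and are not part of the package.

```r
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  # return only the tokens that survive the default filter
  # as a plain character vector (illustrative object name)
  filtered.tokens <- filterByClass(tokenized.obj, as.vector=TRUE)
  print(head(filtered.tokens))
  # keep the input class, but copy the desc slot from the original
  # object instead of recalculating it (illustrative object name)
  filtered.obj <- filterByClass(tokenized.obj, update.desc=FALSE)
} else {}
```

With `as.vector=TRUE` the result can be passed directly to functions that expect a character vector, while `update.desc=FALSE` skips the recalculation of the descriptive statistics.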

[Package koRpus version 0.13-8 Index]
