subset_corpus {R.temis} | R Documentation |
subset_corpus
Description
Select documents containing (or not containing) one or more terms.
Usage
subset_corpus(corpus, dtm, terms, exclude = FALSE, all = FALSE)
Arguments
corpus |
A Corpus object. |
dtm |
A DocumentTermMatrix object. |
terms |
One or more terms appearing in dtm. |
exclude |
Whether documents containing the terms should be excluded rather than retained. |
all |
Whether only documents containing all terms should be retained or excluded. By default, documents need to contain at least one of the terms. |
Value
Corpus object.
Examples
file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
corpus <- import_corpus(file, "factiva", language="en")
dtm <- build_dtm(corpus)
subset_corpus(corpus, dtm, "barrel")
subset_corpus(corpus, dtm, c("barrel", "opec"))
subset_corpus(corpus, dtm, c("barrel", "opec"), exclude=TRUE)
subset_corpus(corpus, dtm, c("barrel", "opec"), all=TRUE)
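
The way exclude and all combine can be illustrated on a toy matrix in base R. This is only a sketch of the selection logic as described in the Arguments section, not the package's implementation: the matrix m and the helper select_docs() are hypothetical and are not part of R.temis.

```r
# Toy document-term counts: rows are documents, columns are terms.
# Hypothetical data standing in for a real document-term matrix.
m <- matrix(c(1, 0,
              0, 2,
              3, 1,
              0, 0),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), c("barrel", "opec")))

# select_docs() is a hypothetical helper, not an R.temis function.
# It returns the names of the documents that would be kept.
select_docs <- function(m, terms, exclude = FALSE, all = FALSE) {
  # TRUE where a document contains a given term at least once
  present <- m[, terms, drop = FALSE] > 0
  # Match on every term (all = TRUE) or on at least one term (default)
  hit <- if (all) rowSums(present) == length(terms) else rowSums(present) > 0
  # exclude = TRUE keeps the documents that did not match
  if (exclude) hit <- !hit
  rownames(m)[hit]
}

select_docs(m, c("barrel", "opec"))                  # "doc1" "doc2" "doc3"
select_docs(m, c("barrel", "opec"), all = TRUE)      # "doc3"
select_docs(m, c("barrel", "opec"), exclude = TRUE)  # "doc4"
select_docs(m, c("barrel", "opec"), exclude = TRUE, all = TRUE)  # "doc1" "doc2" "doc4"
```

Under this reading, exclude=TRUE simply inverts whichever match rule is in effect, so combining it with all=TRUE drops only the documents that contain every term.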
[Package R.temis version 0.1.3 Index]