read.corp.custom,kRp.corpus-method {tm.plugin.koRpus} | R Documentation |
Apply read.corp.custom() to all texts in kRp.corpus objects
Description
This method calls read.corp.custom on all tagged text objects inside the given corpus object.
Usage
## S4 method for signature 'kRp.corpus'
read.corp.custom(corpus, caseSens = TRUE, log.base = 10,
keep_dtm = FALSE, ...)
Arguments
corpus |
An object of class kRp.corpus. |
caseSens |
Logical. If |
log.base |
A numeric value defining the base of the logarithm used for inverse document frequency (idf). See
|
keep_dtm |
Logical. If |
... |
Options to pass through to the |
Details
Since the analysis is based on a document term matrix, a pre-existing matrix stored as a feature of the corpus object will be used if it matches the case sensitivity setting. Otherwise, a new matrix will be generated (but will not replace the existing one). If no document term matrix is present yet, one will also be generated, and it can be kept as an additional feature of the resulting object.
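The following base R sketch illustrates why the case sensitivity setting matters for the document term matrix, and how log.base affects idf. It is not the package's implementation: toy_dtm() and toy_idf() are hypothetical helpers, the token lists are made up, and idf is assumed to follow the common definition log(N / df), where N is the number of documents and df the number of documents containing a term.

```r
# Hypothetical toy corpus: two already tokenized texts (not koRpus objects)
texts <- list(
  doc1 = c("The", "cat", "sat", "on", "the", "mat"),
  doc2 = c("The", "dog", "sat")
)

# Hypothetical helper: build a documents x terms count matrix.
# Lower-casing all tokens mimics the effect of caseSens = FALSE.
toy_dtm <- function(texts, caseSens = TRUE) {
  if (!caseSens) {
    texts <- lapply(texts, tolower)
  }
  terms <- sort(unique(unlist(texts)))
  dtm <- t(vapply(
    texts,
    function(tokens) as.integer(table(factor(tokens, levels = terms))),
    integer(length(terms))
  ))
  colnames(dtm) <- terms
  dtm
}

# Hypothetical helper: idf under the assumed definition log(N / df)
toy_idf <- function(dtm, log.base = 10) {
  n_docs <- nrow(dtm)
  doc_freq <- colSums(dtm > 0)
  log(n_docs / doc_freq, base = log.base)
}

dtm_cs <- toy_dtm(texts, caseSens = TRUE)   # "The" and "the" are separate terms
dtm_ci <- toy_dtm(texts, caseSens = FALSE)  # both are counted as "the"

idf_base10 <- toy_idf(dtm_ci, log.base = 10)
idf_base2 <- toy_idf(dtm_ci, log.base = 2)
```

Because the two matrices differ in their columns, a matrix built with one case sensitivity setting cannot stand in for the other, which is consistent with a new matrix being generated when the stored one does not match.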
Value
An object of the same class as corpus.
Examples
# use readCorpus() to create an object of class kRp.corpus
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  myCorpus <- readCorpus(
    dir=file.path(
      path.package("tm.plugin.koRpus"), "examples", "corpus", "Edwards"
    ),
    hierarchy=list(
      Source=c(
        Wikipedia_prev="Wikipedia (old)",
        Wikipedia_new="Wikipedia (new)"
      )
    ),
    # use tokenize() so examples run without a TreeTagger installation
    tagger="tokenize",
    lang="en"
  )
  # run the frequency analysis on all texts in the corpus
  myCorpus <- read.corp.custom(myCorpus)
  # show the resulting corpus frequency data
  corpusCorpFreq(myCorpus)
} else {}