TermDocumentMatrix.DCorpus {tm.plugin.dc}R Documentation

Term-Document Matrix from Distributed Corpora

Description

Constructs a term-document matrix given a distributed corpus.

Usage

## S3 method for class 'DCorpus'
TermDocumentMatrix(x, control = list())

Arguments

x

A distributed corpus.

control

A named list of control options. The component weighting must be a weighting function capable of handling a TermDocumentMatrix. It defaults to weightTf for term frequency weighting. All other options are delegated internally to a termFreq call.

Value

An object of class TermDocumentMatrix containing a sparse term-document matrix. The attribute Weighting contains the weighting applied to the matrix.

See Also

The documentation of termFreq gives an extensive list of possible options.

TermDocumentMatrix

Examples

data("crude")
tdm <- TermDocumentMatrix(as.DCorpus(crude),
                          list(stopwords = TRUE, weighting = weightTfIdf))
inspect(tdm[149:152,1:5])

[Package tm.plugin.dc version 0.2-10 Index]