R: corpus

corpus_ca {R.temis}

R Documentation

corpus_ca

Description

Run a correspondence analysis on a corpus.

Usage

corpus_ca(corpus, dtm, variables = NULL, ncp = 5, sparsity = 1, ...)

Arguments

`corpus`	A `Corpus` object.
`dtm`	A `DocumentTermMatrix` object corresponding to `corpus` with one row per document.
`variables`	An optional list of variables in `meta(corpus)` over which to aggregate `dtm`. If `NULL` (the default), the analysis is run on the unaggregated matrix.
`ncp`	The number of axes to compute (5 by default). Note that this determines the number of axes that will be used for clustering by `HCPC`. Pass `Inf` to compute all axes.
`sparsity`	A value between 0 and 1. Terms that are absent from more than this proportion of documents are dropped. By default (`sparsity=1`), all terms are kept.
`...`	Additional arguments passed to `FactoMineR::CA`.
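
To make the `variables` and `sparsity` arguments more concrete, the following base R sketch mimics both ideas on a small matrix of counts. It is an illustration only and not the package's implementation: the matrix `counts`, the metadata vector `section` and the threshold value are made up, and the handling of a term lying exactly on the threshold is an assumption.

```r
# Toy document-term counts: 4 documents (rows) by 3 terms (columns).
# matrix() fills column by column, so each line below is one term.
counts <- matrix(c(1, 0, 2, 1,    # "oil"
                   0, 0, 0, 3,    # "opec"
                   1, 1, 0, 1),   # "price"
                 nrow = 4,
                 dimnames = list(paste0("doc", 1:4),
                                 c("oil", "opec", "price")))

# Hypothetical metadata variable with one value per document,
# standing in for a column of meta(corpus).
section <- c("markets", "markets", "politics", "politics")

# Aggregating over a variable: sum the rows that share the same value.
aggregated <- rowsum(counts, group = section)

# Proportion of documents in which each term never occurs.
absent <- colMeans(counts == 0)

# Hypothetical threshold playing the role of `sparsity`.
# Assumption: a term exactly at the threshold would be kept here;
# no term in this toy matrix falls on that boundary.
sparsity <- 0.5
kept <- counts[, absent <= sparsity, drop = FALSE]

print(aggregated)
print(colnames(kept))
```

Here "opec" is absent from three documents out of four, so it is the only term dropped at a threshold of 0.5. With the threshold at 1, no term can exceed it and all three are kept.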

Value

A `CA` object, as returned by `FactoMineR::CA`, containing the correspondence analysis results.
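
As a rough guide to what such results contain, the sketch below computes a correspondence analysis by hand in base R, using the textbook singular value decomposition of the standardized residuals and keeping only the first `ncp` axes. It does not call `FactoMineR::CA`, whose output layout and sign conventions may differ; the table `N` and all variable names are made up for illustration.

```r
# Toy contingency table: 3 documents by 3 terms (made-up counts).
N <- matrix(c(10, 2, 3,
               4, 8, 1,
               2, 3, 9),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3),
                            c("oil", "opec", "price")))

P  <- N / sum(N)      # correspondence matrix (relative frequencies)
rm <- rowSums(P)      # row masses
cm <- colSums(P)      # column masses
E  <- rm %o% cm       # expected frequencies under independence

# Standardized residuals, then their singular value decomposition.
S   <- diag(1 / sqrt(rm)) %*% (P - E) %*% diag(1 / sqrt(cm))
dec <- svd(S)

# Eigenvalues (principal inertias) are the squared singular values.
eig <- dec$d^2

# Keep only the first `ncp` axes, as the `ncp` argument would.
ncp <- 2
row_coord <- (diag(1 / sqrt(rm)) %*% dec$u %*% diag(dec$d))[, seq_len(ncp), drop = FALSE]
rownames(row_coord) <- rownames(N)

print(round(eig, 4))
print(row_coord)
```

The eigenvalues sum to the total inertia of the table, which equals its chi-squared statistic divided by the grand total. A 3 by 3 table has at most two non-trivial axes, so the third eigenvalue is numerically zero.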

Examples


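# Locate the sample Factiva file shipped with tm.plugin.factiva
file <- system.file("texts", "reut21578-factiva.xml", package="tm.plugin.factiva")
# Import the corpus and build its document-term matrix
corpus <- import_corpus(file, "factiva", language="en")
dtm <- build_dtm(corpus)
# Compute 3 axes, dropping terms absent from more than 98% of documents
corpus_ca(corpus, dtm, ncp=3, sparsity=0.98)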

[Package R.temis version 0.1.3 Index]