| tfidf {textir} | R Documentation |
tf-idf
Description
Computes term frequency, inverse document frequency (tf-idf) weights for a matrix of token counts.
Usage
tfidf(x,normalize=TRUE)
Arguments
x | A |
normalize | Whether to normalize term frequencies by document totals (default TRUE). |
Value
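A matrix of the same type as x, with each count x_{ij} replaced by its tf-idf weight
f_{ij} * \log[n/(d_j+1)],
where n is the number of documents (rows of x), f_{ij} is x_{ij}/m_i if normalize is TRUE and x_{ij} otherwise,
m_i is the total count for document i, and d_j is the number of documents containing token j.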
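The formula above can be traced by hand on a small dense matrix. The following is a minimal base-R sketch, not the package's implementation: tfidf_sketch and the toy counts matrix are hypothetical names introduced only for illustration, and the sketch assumes rows are documents, columns are tokens, and no document is empty (an all-zero row would give NaN when normalize is TRUE).

```r
## Hypothetical helper (not part of textir): dense base-R version of
## f_ij * log[n / (d_j + 1)], assuming rows = documents, columns = tokens.
tfidf_sketch <- function(x, normalize = TRUE) {
  x <- as.matrix(x)
  n <- nrow(x)                    # n: number of documents
  d <- colSums(x > 0)             # d_j: documents containing token j
  ## f_ij: counts divided by document totals m_i, or the raw counts
  f <- if (normalize) x / rowSums(x) else x
  ## scale each column j by its inverse document frequency term
  sweep(f, 2, log(n / (d + 1)), "*")
}

## Toy data (made up for illustration): 4 documents, 3 tokens
counts <- matrix(c(2, 0, 1,
                   0, 3, 1,
                   1, 1, 0,
                   0, 0, 4),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("doc", 1:4),
                                 c("good", "bad", "food")))

w <- tfidf_sketch(counts)
w_raw <- tfidf_sketch(counts, normalize = FALSE)
```

Because of the +1 in the denominator, a token that appears in n - 1 documents gets weight zero ("food" here, with d_j = 3 and n = 4), and a token that appears in every document would get a negative weight.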
Author(s)
Matt Taddy taddy@chicagobooth.edu
See Also
pls, we8there
Examples
data(we8there)
## 20 high-variance tf-idf terms
colnames(we8thereCounts)[
order(-sdev(tfidf(we8thereCounts)))[1:20]]
[Package textir version 2.0-5 Index]