create_ttm {chinese.misc}    R Documentation

Create Term-Term Matrix (Term-Cooccurrence Matrix)

Description

This is a convenience function that creates a term-term matrix from a document-term matrix, a term-document matrix, or a plain matrix that represents one of the two. Sparse matrices are used to speed up the computation. The output can be either an ordinary matrix or a sparse matrix.
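
As a rough illustration of what a term-term matrix contains, the sketch below builds one by hand from a tiny document-term matrix in base R. It assumes that cooccurrence is the cross-product of the DTM, which is a common definition; whether create_ttm uses exactly this computation (for example, how it fills the diagonal) is an assumption, not something this page states. The toy matrix and its terms are made up.

```r
# Toy document-term matrix: 3 documents (rows) x 3 terms (columns).
# The counts are invented for illustration.
dtm_toy <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("drink", "milk", "coffee"))
)

# Assumed definition: t(DTM) %*% DTM. Entry [i, j] sums, over all documents,
# the product of the counts of term i and term j.
ttm_toy <- crossprod(dtm_toy)

# A term-document matrix is the transpose of a DTM, so tcrossprod()
# on the TDM yields the same term-term matrix.
tdm_toy <- t(dtm_toy)
ttm_from_tdm <- tcrossprod(tdm_toy)
```

Under this definition the result is symmetric, and an off-diagonal cell is zero when the two terms never appear in the same document.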

Usage

create_ttm(x, type = "dtm", tomatrix = FALSE, checks = TRUE)

Arguments

x

an object of class DocumentTermMatrix or TermDocumentMatrix, or a matrix whose row names or column names are terms.

type

if x is a matrix, this argument tells whether it is a DTM or a TDM: for the former, a character string starting with "D" or "d"; for the latter, one starting with "T" or "t".

tomatrix

a logical value indicating whether to output an ordinary matrix. If TRUE, a matrix representing the TTM is returned. If FALSE (default), a list is returned: the first element is a sparse matrix created by package Matrix, which does not carry the words; the second element is a character vector of these words.

checks

if x is a matrix, whether to check its validity, that is, whether it is numeric, all values are non-negative, and there are no NAs.
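
The conditions named under type and checks can be sketched in base R as follows. Both helpers, is_valid_count_matrix and guess_orientation, are hypothetical names introduced here for illustration; they mirror the description above and are not the package's own code.

```r
# Hypothetical helper: the checks described for a plain matrix input
# (numeric, no NA, no negative values).
is_valid_count_matrix <- function(m) {
  # && short-circuits, so all(m >= 0) is only reached when there is no NA
  is.matrix(m) && is.numeric(m) && !anyNA(m) && all(m >= 0)
}

# Hypothetical helper: only the first letter of `type` matters, ignoring case.
guess_orientation <- function(type) {
  first <- tolower(substr(type, 1, 1))
  if (first == "d") {
    "dtm"
  } else if (first == "t") {
    "tdm"
  } else {
    stop("type must start with D/d or T/t")
  }
}

ok  <- matrix(c(1, 0, 2, 3), nrow = 2)    # passes all checks
bad <- matrix(c(1, NA, -2, 3), nrow = 2)  # contains NA and a negative value

is_valid_count_matrix(ok)
is_valid_count_matrix(bad)
guess_orientation("DTM")
guess_orientation("tdm")
```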

Examples

x <- c(
  "Hello, what do you want to drink?", 
  "drink a bottle of milk", 
  "drink a cup of coffee", 
  "drink some water")
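# Build a DTM from the character vector (from = "v": x holds the texts)
dtm <- corp_or_dtm(x, from = "v", type = "dtm")
# Default: a list with a sparse matrix and the vector of words
ttm1 <- create_ttm(dtm)
# tomatrix = TRUE: an ordinary matrix
ttm2 <- create_ttm(dtm, tomatrix = TRUE)
# A TermDocumentMatrix is accepted as well
tdm <- t(dtm)
ttm3 <- create_ttm(tdm)
# Convert the sparse result to an ordinary matrix and attach the words
ttm_sparse <- ttm3[[1]]
ttm_ordinary <- as.matrix(ttm_sparse)
colnames(ttm_ordinary) <- ttm3[[2]]
rownames(ttm_ordinary) <- ttm3[[2]]
# You can also use Matrix::writeMM(ttm_sparse, filename)
# to write it to disk.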
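
Because the sparse matrix in the list result carries no words, saving it usually means saving the word vector alongside it. The following is a minimal sketch of such a round trip using Matrix::writeMM and Matrix::readMM. The toy matrix, the word vector, and the temporary file names are made up, and the list layout (sparse matrix first, words second) follows the description of tomatrix above.

```r
library(Matrix)

# Stand-in for the list returned with tomatrix = FALSE (invented data):
# a sparse 3 x 3 matrix without dimnames, plus the words in the same order.
ttm_sparse_toy <- sparseMatrix(
  i = c(1, 1, 1, 2, 2, 3, 3),
  j = c(1, 2, 3, 1, 2, 1, 3),
  x = c(3, 1, 1, 1, 1, 1, 1),
  dims = c(3, 3)
)
words_toy <- c("drink", "milk", "coffee")

# Write the matrix in MatrixMarket format and the words as plain text.
mtx_file   <- tempfile(fileext = ".mtx")
words_file <- tempfile(fileext = ".txt")
writeMM(ttm_sparse_toy, mtx_file)
writeLines(words_toy, words_file)

# Read both back and reattach the words as dimnames.
restored <- as.matrix(readMM(mtx_file))
restored_words <- readLines(words_file)
dimnames(restored) <- list(restored_words, restored_words)
```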

[Package chinese.misc version 0.2.3 Index]