R: Rewrite Terms and Frequencies into Many Files

m2doc {chinese.misc}

R Documentation

Rewrite Terms and Frequencies into Many Files

Description

Given a matrix that represents a document term matrix, this function treats each row as the term frequencies of one file and rewrites each row as a text. Some text mining tools other than R accept segmented Chinese texts. If you have already converted texts into a matrix, you can use this function to convert it back into texts, which can then be used to build a corpus or to create a document term matrix again.
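The conversion described above can be sketched in base R. This is only an illustration of the idea, not the package's own code: `row_to_text` is a hypothetical helper, and the order in which `m2doc` emits terms within a text is an assumption here (column order).

```r
# Hypothetical helper, not part of chinese.misc: a base R sketch of the idea.
row_to_text <- function(freq, terms) {
  # Repeat each term as many times as its frequency, then join with spaces.
  paste(rep(terms, times = freq), collapse = " ")
}

# A small document term matrix: 2 texts (rows), 3 terms (columns).
m <- matrix(c(2, 0, 1,
              1, 3, 0), nrow = 2, byrow = TRUE)
colnames(m) <- c("text", "mining", "data")

# Apply the helper to every row; the result is one text per row.
texts <- apply(m, 1, row_to_text, terms = colnames(m))
texts
# "text text data"  "text mining mining mining"
```

A term with frequency 0 is simply dropped from its text, because `rep` repeats it zero times.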

Usage

m2doc(m, checks = FALSE)

Arguments

`m`	a numeric matrix; a data frame is not allowed. It must be a document term matrix rather than a term document matrix, so each row represents one text. The column names are the terms to be written; if they are `NULL`, the function uses "term1", "term2", "term3", ... instead. The matrix must not contain `NA`.
`checks`	should be `TRUE` or `FALSE`. If it is `TRUE`, the function checks whether the input contains any `NA`, whether it is numeric, and whether it contains any negative number. The default is `FALSE` to save time.
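Because `checks` defaults to `FALSE`, it may be worth validating the input yourself before the call. The sketch below mirrors the requirements listed above in base R. `check_dtm` is a hypothetical helper, and its error messages are not those of the package.

```r
# Hypothetical helper, not part of chinese.misc.
check_dtm <- function(m) {
  # Only numeric matrices are accepted; a data frame fails is.matrix().
  if (!is.matrix(m) || !is.numeric(m)) stop("m must be a numeric matrix")
  if (anyNA(m)) stop("m must not contain NA")
  if (any(m < 0)) stop("m must not contain negative numbers")
  # Fall back to generic term names when column names are missing.
  if (is.null(colnames(m))) {
    colnames(m) <- paste0("term", seq_len(ncol(m)))
  }
  m
}

m <- check_dtm(matrix(1:6, nrow = 2))
colnames(m)
# "term1" "term2" "term3"
```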

Value

A character vector in which each element is one text: every term is repeated (with `rep`) as many times as its frequency, and the repeated terms are joined by spaces.
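Since each returned text is a space-separated sequence of terms, it can be split and counted again. The following base R sketch rebuilds a document term matrix from two such texts. The input vector is written by hand to stand in for the function's output, and sorting the terms alphabetically is a choice made for this example only.

```r
# Stand-in for a character vector of space-separated texts.
texts <- c("text text data", "text mining mining mining")

# Split each text into its terms.
tokens <- strsplit(texts, " ", fixed = TRUE)

# The vocabulary: all distinct terms, sorted.
terms <- sort(unique(unlist(tokens)))

# Count every vocabulary term in every text; one row per text.
dtm <- do.call(rbind, lapply(tokens, function(x) {
  tabulate(match(x, terms), nbins = length(terms))
}))
colnames(dtm) <- terms
dtm
```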

Examples

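# Draw 20 random term frequencies between 1 and 5
s <- sample(1:5, 20, replace = TRUE)
# Arrange them as a document term matrix: 5 texts (rows) by 4 terms (columns)
m <- matrix(s, nrow = 5)
colnames(m) <- c("r", "text", "mining", "data")
# Rewrite each row as one text
m2doc(m)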

[Package chinese.misc version 0.2.3 Index]