dtm_colsums {udpipe} | R Documentation |
Column sums and Row sums for document term matrices
Description
Column sums and row sums for document term matrices, optionally summed over groups of columns or rows.
Usage
dtm_colsums(dtm, groups)
dtm_rowsums(dtm, groups)
Arguments
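dtm | an object returned by document_term_matrix (as in the examples below)
groups | optionally, a list with column/row names or column/row indexes of dtm; the columns/rows listed in each element are summed together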
Value
Returns either a vector, in case argument groups is not provided, or a sparse matrix of class dgCMatrix, in case argument groups is provided.
in case groups is not provided: a vector of row/column sums with corresponding names
in case groups is provided: a sparse matrix containing summed information over the groups of rows/columns
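The two kinds of return value can be illustrated with the Matrix package alone. The following is a minimal sketch, not udpipe's implementation: the toy matrix, the helper names known and indicator, and the layout of the grouped result (documents as rows, one column per group) are assumptions made for illustration.

```r
library(Matrix)

# Toy document-term matrix (hypothetical data): 2 documents x 4 terms
dtm <- sparseMatrix(
  i = c(1, 1, 1, 2, 2),
  j = c(1, 2, 3, 3, 4),
  x = c(2, 1, 1, 2, 1),
  dims = c(2, 4),
  dimnames = list(c("doc1", "doc2"), c("aa", "b", "bb", "BB")))

# Without groups: a named numeric vector with one sum per term
ungrouped <- colSums(dtm)

# With groups: one summed column per group.
# Assumption: names not found in the matrix (here "A") are simply skipped.
groups <- list(combinedB = c("b", "bb"), combinedA = c("aa", "A"))
known  <- lapply(groups, intersect, colnames(dtm))
members <- unlist(known)

# Term x group indicator matrix: 1 where a term belongs to a group
indicator <- sparseMatrix(
  i = match(members, colnames(dtm)),
  j = rep(seq_along(known), lengths(known)),
  x = rep(1, length(members)),
  dims = c(ncol(dtm), length(known)),
  dimnames = list(colnames(dtm), names(known)))

# Multiplying sums the member columns of each group per document
grouped <- dtm %*% indicator
```

Here ungrouped is a plain named vector, while grouped stays sparse, which mirrors the vector versus sparse matrix distinction described above.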
Examples
x <- data.frame(
doc_id = c(1, 1, 2, 3, 4),
term = c("A", "C", "Z", "X", "G"),
freq = c(1, 5, 7, 10, 0))
dtm <- document_term_matrix(x)
x <- dtm_colsums(dtm)
x
x <- dtm_rowsums(dtm)
head(x)
##
## Grouped column summation
##
x <- list(doc1 = c("aa", "bb", "aa", "b"), doc2 = c("bb", "bb", "BB"))
dtm <- document_term_matrix(x)
dtm
dtm_colsums(dtm, groups = list(combinedB = c("b", "bb"), combinedA = c("aa", "A")))
dtm_colsums(dtm, groups = list(combinedA = c("aa", "A")))
dtm_colsums(dtm, groups = list(
combinedB = grep(pattern = "b", colnames(dtm), ignore.case = TRUE, value = TRUE),
combinedA = c("aa", "A", "ZZZ"),
test = character()))
dtm_colsums(dtm, groups = list())
##
## Grouped row summation
##
x <- list(doc1 = c("aa", "bb", "aa", "b"),
doc2 = c("bb", "bb", "BB"),
doc3 = c("bb", "bb", "BB"),
doc4 = c("bb", "bb", "BB", "b"))
dtm <- document_term_matrix(x)
dtm
dtm_rowsums(dtm, groups = list(doc1 = "doc1", combi = c("doc2", "doc3", "doc4")))
dtm_rowsums(dtm, groups = list(unknown = "docUnknown", combi = c("doc2", "doc3", "doc4")))
dtm_rowsums(dtm, groups = list())
[Package udpipe version 0.8.11 Index]