showCluster {phm}    R Documentation

Show Cluster Contents

Description

Show all documents in a cluster together with their non-zero terms. Terms are ordered first by the number of documents in which they appear (highest first), then by their total frequency in the cluster.

Usage

showCluster(tdm, clust, cl, n = 10L)

Arguments

tdm

A term frequency matrix, with terms on the rows and documents on the columns.

clust

A vector indicating, for each column in tdm, which cluster it belongs to.

cl

The number of the cluster to show.

n

Integer giving the maximum number of terms to be returned (default 10).

Value

A matrix with the document names of tdm on the columns and terms on the rows, restricted to the columns in the cluster. Terms that appear in the most documents (columns) are shown first; among terms that appear in the same number of documents, those with the highest total frequency in the cluster come first. Two columns are added at the end of the matrix, giving the number of documents each term appears in and its total frequency in the cluster. The number of terms displayed equals n, or fewer if the cluster contains fewer than n terms.

If the cluster contains no terms at all, a list is returned instead, with the items docs and note: docs is a vector with the names of all documents in the cluster, and note states that the cluster has no terms.

Examples

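# Term frequency matrix: 4 terms (rows) by 4 documents (columns)
M <- matrix(c(0, 1, 0, 2, 0, 10, 0, 14, 12, 0, 8, 0, 1, 0, 1, 0), 4)
colnames(M) <- 1:4
rownames(M) <- c("A", "B", "C", "D")
# Cluster the documents with textCluster, then show the contents of cluster 1
tc <- textCluster(M, 2)
showCluster(M, tc$cluster, 1)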
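
The ordering described under Value can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name show_cluster_sketch, the added column names docs and freq, the wording of the note, and the hand-assigned cluster vector are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of phm: mimics the documented ordering in base R.
show_cluster_sketch <- function(tdm, clust, cl, n = 10L) {
  sub <- tdm[, clust == cl, drop = FALSE]  # documents (columns) in cluster cl
  n_docs <- rowSums(sub > 0)               # documents each term appears in
  tot <- rowSums(sub)                      # total frequency within the cluster
  keep <- n_docs > 0                       # drop terms absent from the cluster
  if (!any(keep)) {
    # Assumed note wording; the real function's text may differ
    return(list(docs = colnames(sub), note = "No terms in this cluster"))
  }
  sub <- sub[keep, , drop = FALSE]
  n_docs <- n_docs[keep]
  tot <- tot[keep]
  # Most documents first, ties broken by highest total frequency; keep at most n
  ord <- head(order(-n_docs, -tot), n)
  # Column names docs and freq are assumed labels for the two added columns
  cbind(sub[ord, , drop = FALSE], docs = n_docs[ord], freq = tot[ord])
}

M <- matrix(c(0, 1, 0, 2, 0, 10, 0, 14, 12, 0, 8, 0, 1, 0, 1, 0), 4)
colnames(M) <- 1:4
rownames(M) <- c("A", "B", "C", "D")
clust <- c(1, 1, 2, 2)  # hand-assigned clusters, for illustration only
show_cluster_sketch(M, clust, 1)
```

With these assumed clusters, terms B and D both appear in the two documents of cluster 1, so the tie is broken by total frequency and D is listed before B.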

[Package phm version 1.1.2 Index]