post_analy_clus {twl}    R Documentation

Assigns cluster labels by building dendrogram and thresholding at specified height

Description

Assigns cluster labels to the samples in each dataset by building a dendrogram and thresholding it at a specified height.

Usage

post_analy_clus(outpu_new, clus_sav_new, num_clusts, height_clusts_vec = NULL,
  titles, pdf_path)

Arguments

outpu_new

The output of the pairwise_clus function: a list whose length is the number of datasets being integrated, each element of which is a posterior similarity matrix. Each matrix is symmetric, with dimension equal to the number of samples in the respective dataset, and its elements are values between 0 and 1 that estimate the probability that two samples fall in the same cluster.

clus_sav_new

List of samples output by the TWLsample function. See Details for additional explanation of this parameter and height_clusts_vec.

num_clusts

A vector whose length is the number of integrated datasets, specifying, for each dataset, the number of cluster labels to identify from the generated dendrogram.

height_clusts_vec

A vector of dendrogram heights whose length is the number of integrated datasets. Use it if you prefer to inspect the output dendrograms manually and specify the heights at which to threshold them, thereby defining cluster membership. Defaults to NULL. See Details for additional explanation of this parameter and num_clusts.

titles

Vector of strings whose length is the number of datasets, used as prefixes in the column labels of the output list of data.tables.

pdf_path

File path where the dendrogram figures will be saved as a PDF.
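
As an illustration of the shape of outpu_new and of the two thresholding arguments, here is a minimal base-R sketch. It is not the twl implementation: the toy label draws, the variable names (label_draws, psm, tree) and the choice of average linkage on 1 minus similarity are all assumptions made for this example.

```r
# Toy posterior draws: each row is one sampled clustering of 6 samples.
# (Hypothetical data, not produced by TWLsample.)
label_draws <- rbind(
  c(1, 1, 1, 2, 2, 2),
  c(1, 1, 2, 2, 2, 2),
  c(1, 1, 1, 2, 2, 3),
  c(1, 1, 1, 1, 2, 2)
)
colnames(label_draws) <- paste0("s", 1:6)

# Posterior similarity matrix: fraction of draws in which two samples
# share a cluster label. Symmetric, with entries between 0 and 1.
n <- ncol(label_draws)
psm <- matrix(0, n, n,
              dimnames = list(colnames(label_draws), colnames(label_draws)))
for (i in seq_len(nrow(label_draws))) {
  psm <- psm + outer(label_draws[i, ], label_draws[i, ], "==")
}
psm <- psm / nrow(label_draws)

# Build a dendrogram from the dissimilarity 1 - psm (assumed linkage: average).
tree <- hclust(as.dist(1 - psm), method = "average")

# Two ways to turn the dendrogram into labels:
by_k <- cutree(tree, k = 2)    # ask for a number of clusters (cf. num_clusts)
by_h <- cutree(tree, h = 0.6)  # cut at a height (cf. height_clusts_vec)
```

Plotting the tree with plot(tree) is one way to choose a cut height by eye before passing it on.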

Details

Either num_clusts or height_clusts_vec, or both, can be specified. If both are specified, the heights are first used to cut the dendrogram for a preliminary cluster assignment; the X largest of these clusters then receive the final, output assignment, and all remaining samples receive a "clus_unknown" label. Here X is the corresponding element of the num_clusts argument vector.
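
The combined rule can be sketched in base R as follows. The helper cut_then_keep_largest and the toy points are hypothetical and exist only for this illustration; the package's internal code may differ, for example in how it breaks ties between clusters of equal size.

```r
# Hypothetical helper illustrating the rule described above; not part of twl.
cut_then_keep_largest <- function(tree, height, x) {
  # Preliminary assignment: cut the dendrogram at the given height.
  prelim <- cutree(tree, h = height)
  # Rank the preliminary clusters by size and keep the x largest.
  sizes <- sort(table(prelim), decreasing = TRUE)
  keep <- names(sizes)[seq_len(min(x, length(sizes)))]
  # Everything outside the kept clusters is relabelled "clus_unknown".
  final <- as.character(prelim)
  final[!(final %in% keep)] <- "clus_unknown"
  names(final) <- names(prelim)
  final
}

# Toy 1-D data: a group of three, a group of two, and two isolated points.
pts <- matrix(c(0, 0.1, 0.2, 5, 5.1, 9, 20), ncol = 1,
              dimnames = list(paste0("p", 1:7), NULL))
tree <- hclust(dist(pts), method = "average")

# Cutting at height 1 gives four preliminary clusters; keeping the two
# largest leaves the two isolated points labelled "clus_unknown".
final <- cut_then_keep_largest(tree, height = 1, x = 2)
```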

Value

post_lab: a list of data.tables, each with 2 columns named 'nam' and '*_clus'. The nam column holds the sample name annotation and the *_clus column holds the assigned cluster, where * is the corresponding element of the titles argument vector.
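
A short sketch of how a result of this shape might be inspected. The object below is a hand-built stand-in that uses base data.frames in place of data.tables so the snippet has no dependencies; the sample names and labels are invented, and joining on nam assumes some samples appear in more than one dataset.

```r
# Hypothetical stand-in for the returned list (data.frames, not data.tables).
post_lab <- list(
  data.frame(nam = c("s1", "s2", "s3"),
             title1_clus = c("1", "1", "clus_unknown")),
  data.frame(nam = c("s2", "s3", "s4"),
             title2_clus = c("2", "1", "1"))
)

# Cluster sizes in the first dataset.
sizes_1 <- table(post_lab[[1]]$title1_clus)

# Side-by-side labels for samples present in both datasets.
merged <- merge(post_lab[[1]], post_lab[[2]], by = "nam")
```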

Examples

data(data_and_output)
## Not run: 
clus_save <- TWLsample(misaligned_mat, misaligned, output_every = 50,
  num_its = 5000, manip = FALSE)
outpu_new <- pairwise_clus(clus_save, BURNIN = 2000)
post_analy_cor(outpu_new, c("title1", "title2", "title3", "title4", "title5"),
  tempfile(), ords = 'none')
clus_labs <- post_analy_clus(outpu_new, clus_save, 2:6, rep(0.6, 5),
  c("title1", "title2", "title3", "title4", "title5"), tempfile())
output_nest <- cross_dat_analy(clus_save, 4750)

## End(Not run)

[Package twl version 1.0 Index]