top_supervised {AutoPipe}    R Documentation

A Function for Assisting Supervised Clustering

Description

When performing a supervised clustering, run this function first to get the best results. It selects the top genes and returns the silhouette of each sample under the supplied clustering.

Usage

top_supervised(me,TOP=1000,cluster_which,TRw=-1)

Arguments

me

The gene expression matrix. The columns should be the samples, with the sample names as column names; the rows should be the genes, with the gene identifiers as row names, ideally the ENTREZID.

TOP

The number of top genes to choose; the default is 1000.

cluster_which

A data frame with the supervised clustering arrangement of the samples. It should have the sample names in the first column and the cluster assignment in the second column.

TRw

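The threshold for excluding samples: samples with silhouette width < TRw are excluded. Silhouette widths lie between -1 and 1, so the default of -1 excludes no samples.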
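
The shape of cluster_which and the effect of TRw can be illustrated with a small sketch. It is not the internal code of top_supervised; it assumes the silhouette widths are computed on distances between samples, here with silhouette() from the cluster package, and the toy matrix, sample names and gene names are all hypothetical.

```r
library(cluster)  # provides silhouette(); a recommended package shipped with R

set.seed(1)
# Hypothetical toy data: 20 genes (rows) x 8 samples (columns)
me <- matrix(rnorm(20 * 8), nrow = 20,
             dimnames = list(paste0("gene", 1:20), paste0("s", 1:8)))
me[, 5:8] <- me[, 5:8] + 3  # shift the second group so the clusters separate

# cluster_which: sample names in the first column, clustering in the second
cluster_which <- data.frame(sample = colnames(me),
                            cluster = rep(c(1, 2), each = 4),
                            stringsAsFactors = FALSE)

# Silhouette widths of the samples (assumption: Euclidean distances)
groups <- as.integer(cluster_which[[2]])
sil <- silhouette(groups, dist(t(me)))
sil_width <- sil[, "sil_width"]
names(sil_width) <- cluster_which[[1]]

# Samples with silhouette width < TRw are excluded
TRw <- -1
keep <- sil_width >= TRw
me_kept <- me[, keep, drop = FALSE]
```

With TRw = -1 every sample is kept; raising the threshold, for example to 0, would drop samples that sit closer to the other cluster than to their own.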

Value

A list. The first element is the expression matrix; the second is the silhouette for each sample.

Examples



library(AutoPipe)
library(org.Hs.eg.db)
data(rna)
## supervised assignment: 24 samples in cluster 1, then 24 samples in cluster 2
cluster_which <- cbind(colnames(rna), c(rep(1, times = 24), rep(2, times = 24)))
me_x <- rna
## choose the top genes and get the silhouette for each sample
res <- top_supervised(me_x, TOP = 100, cluster_which)
me_TOP <- res[[1]]      # first element: the expression matrix
number_of_k <- 2
groups_men <- res[[2]]  # second element: the silhouette for each sample
me_x <- me_TOP
colnames(me_x)
o_g <- Supervised_Cluster_Heatmap(groups_men = groups_men, gene_matrix = me_x,
                                  method = "PAMR", show_sil = TRUE, print_genes = TRUE,
                                  threshold = 0, TOP = 100, GSE = FALSE,
                                  plot_mean_sil = FALSE, stats_clust = res[[2]],
                                  samples_data = as.data.frame(groups_men[, 1, drop = FALSE]))

[Package AutoPipe version 0.1.6 Index]