u_cluster_similarity {eclust}  R Documentation 
Returns the cluster membership of each predictor. This function is
called internally by the s_generate_data and s_generate_data_mars
functions. It is also used by the r_clust function for real data analysis.
u_cluster_similarity(x, expr, exprTest, distanceMethod,
clustMethod = c("hclust", "protoclust"), cutMethod = c("dynamic", "gap",
"fixed"), nClusters, method = c("complete", "average", "ward.D2", "single",
"ward.D", "mcquitty", "median", "centroid"), K.max = 10, B = 50, nPC,
minimum_cluster_size = 50)
x 
similarity matrix. Must have non-NULL dimnames, i.e., the rows and columns should be labelled, e.g. "Gene1", "Gene2", ...
expr 
gene expression data (training set). Rows are people, columns are genes.
exprTest 
gene expression test set. If you are using real data and don't
have enough samples for a test set, just supply the same data supplied
to the 
distanceMethod 
one of "euclidean","maximum","manhattan", "canberra",
"binary","minkowski" to be passed to 
clustMethod 
Cluster the data using hierarchical clustering or
prototype clustering. Defaults 
cutMethod 
what method to use to cut the dendrogram. 
nClusters 
number of clusters. Only used if 
method 
the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC). 
K.max 
the maximum number of clusters to consider, must be at least
two. Only used if 
B 
integer, number of Monte Carlo (“bootstrap”) samples. Only used if

nPC 
number of principal components. Can be 1 or 2. 
minimum_cluster_size 
The minimum cluster size. Only applicable if

a list of length 2:
  clusters: a p x 3 data.frame or data.table which gives the cluster membership of each gene, where p is the number of genes. The first column is the gene name, the second column is the cluster number (numeric) and the third column is the cluster membership as a character vector of color names (these will match up exactly with the cluster number)
  pcInfo: a list of length 9:
    a list of the eigengenes, i.e. the 1st (and 2nd if nPC=2) principal component of each module
    a data.frame of the average expression for each module for the training set
    a data.frame of the average expression for each module for the test set
    percentage of variance explained by each 1st (and 2nd if nPC=2) principal component of each module
    cluster membership of each gene
    a data.frame of the 1st (and 2nd if nPC=2) PC for each module for the training set
    a data.frame of the 1st (and 2nd if nPC=2) PC for each module for the test set
    the prcomp object
    a numeric value for the total number of clusters
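The per-module summaries listed above (average expression, first principal component, variance explained) can be illustrated with base R alone. The following is only a sketch on simulated data, not the package's implementation; the objects expr, exprTest, membership and the helper summarise_cluster are hypothetical names introduced for illustration.

```r
# Sketch only: summarise each cluster by its average expression and its
# first principal component. All object names here are illustrative.
set.seed(1)
genes <- paste0("Gene", 1:6)
expr     <- matrix(rnorm(40 * 6), nrow = 40, dimnames = list(NULL, genes))
exprTest <- matrix(rnorm(20 * 6), nrow = 20, dimnames = list(NULL, genes))

# Hypothetical cluster labels for the six genes
membership <- c(Gene1 = 1, Gene2 = 1, Gene3 = 1,
                Gene4 = 2, Gene5 = 2, Gene6 = 2)

summarise_cluster <- function(k) {
  in_k <- names(membership)[membership == k]
  pca  <- prcomp(expr[, in_k, drop = FALSE], center = TRUE, scale. = TRUE)
  list(
    averageExpr     = rowMeans(expr[, in_k, drop = FALSE]),
    averageExprTest = rowMeans(exprTest[, in_k, drop = FALSE]),
    PC1             = pca$x[, 1],
    # project the test set onto the loadings learned from the training set
    PC1Test         = predict(pca, newdata = exprTest[, in_k, drop = FALSE])[, 1],
    varExplained    = pca$sdev[1]^2 / sum(pca$sdev^2)
  )
}

cluster_ids <- sort(unique(membership))
summaries <- lapply(cluster_ids, summarise_cluster)
names(summaries) <- paste0("cluster", cluster_ids)
```

Here the test-set scores are obtained by projecting onto the training-set loadings with predict(), which is one common choice; whether u_cluster_similarity does exactly this is not stated in this page.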
data("simdata")
X = simdata[,c(-1,-2)]
train_index <- sample(1:nrow(simdata),100)
cluster_results <- u_cluster_similarity(x = cor(X),
expr = X[train_index,],
exprTest = X[train_index,],
distanceMethod = "euclidean",
clustMethod = "hclust",
cutMethod = "dynamic",
method = "average", nPC = 2,
minimum_cluster_size = 75)
cluster_results$clusters[, table(module)]
names(cluster_results$pcInfo)
cluster_results$pcInfo$nclusters
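For readers without eclust installed, the roles of distanceMethod, method and nClusters can be approximated with base R. This is a rough sketch on simulated data, assuming that a distance is computed between the rows of the similarity matrix and that a fixed cut corresponds to cutree(); the package may differ in both respects. The objects expr, sim, tree and clusters are made up for illustration.

```r
# Sketch only: base-R analogue of hierarchical clustering with a fixed cut.
set.seed(42)
expr <- matrix(rnorm(50 * 8), nrow = 50,
               dimnames = list(NULL, paste0("Gene", 1:8)))

# Make Gene1-Gene4 strongly correlated by adding a shared signal
shared <- rnorm(50)
expr[, 1:4] <- expr[, 1:4] + 3 * shared

sim  <- cor(expr)                        # labelled similarity matrix (dimnames set)
d    <- dist(sim, method = "euclidean")  # plays the role of distanceMethod (assumption)
tree <- hclust(d, method = "average")    # plays the role of method
labels <- cutree(tree, k = 2)            # plays the role of nClusters with a fixed cut

clusters <- data.frame(gene = names(labels), cluster = unname(labels))
table(clusters$cluster)
```

The gene labels in the result come from the dimnames of the similarity matrix, which is why an unlabelled matrix would give a membership table without gene names.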