ccml {ccml} | R Documentation |
Two-step consensus clustering of multiple predictive labels with different sample coverage (missing labels)
Description
A two-step consensus clustering that takes multiple predictive labels with different sample coverage (missing labels) as input.
Usage
ccml(
title,
label,
output = "rdata",
nperm = 10,
ncore = 1,
seedn = 100,
stability = TRUE,
maxK = 15,
reps = 1000,
pItem = 0.9,
plot = NULL,
clusterAlg = "spectralClusteringAffinity",
innerLinkage = "complete",
...
)
Arguments
title |
A character value giving the output directory. The directory is created only if it does not already exist. The title can be an absolute or relative path. Input for |
label |
A matrix or data frame of input labels, or a character value giving the input file name. Columns are different clustering results and rows are samples. |
output |
A character value for the output format. With "rdata" (default), results are saved to an .rdata file when both output and plot are not NULL; otherwise results are returned to the workspace. |
nperm |
An integer value giving the number of permutations, or nperm=10 (default), which means
ncore |
An integer value giving the number of cores to use, or ncore=1 (default). This is the number of cores used for parallel computation in this package. |
seedn |
A numerical value setting the initial random seed for reproducible results, or seedn=100 (default). After every 1000 iterations the seed is incremented by 1 so that results remain repeatable. Input for |
stability |
A logical value. Whether to estimate the stability of the normalized consensus weight based on the number of permutations (default stability=TRUE). Input for |
maxK |
Integer value. Maximum number of clusters to evaluate. Input for |
reps |
Integer value. Number of subsamples. Input for |
pItem |
Numerical value. Proportion of items to sample. Input for |
plot |
Character value. NULL (default) prints to screen; the alternatives are 'pdf', 'png', or 'pngBMP' (bitmap png, helpful for large datasets). Input for |
clusterAlg |
Character value. Cluster algorithm. 'spectralClusteringAffinity' (default) for spectral clustering of a similarity/affinity matrix; the other methods cluster a distance matrix: 'hc' for hierarchical (hclust), 'pam' for partitioning around medoids,
'km' for k-means upon the data matrix, 'kmdist' for k-means upon distance matrices (former km option), or a function that returns a clustering. Input for |
innerLinkage |
Hierarchical linkage method for subsampling, or "complete" (default). Input for |
... |
Other input arguments for |
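A minimal sketch of what a label input with different sample coverage could look like: rows are samples and columns are clustering results, as described for the label argument. The method names, sample names, and the use of NA to mark samples a method did not classify are assumptions made for illustration; this page does not state how missing labels must be encoded.

```r
# Three hypothetical labelings of six samples. NA marks a sample that a
# given method did not classify (the NA encoding is an assumption).
label <- data.frame(
  method_a = c("A1", "A1", "A2", "A2", NA,   NA),
  method_b = c("B1", NA,   "B2", "B2", "B1", "B2"),
  method_c = c(NA,   "C1", "C1", "C2", "C2", "C2"),
  row.names = paste0("sample", 1:6),
  stringsAsFactors = FALSE
)

# Fraction of samples covered by each labeling.
coverage <- colMeans(!is.na(label))
print(coverage)
```

Checking coverage before clustering makes it easy to spot a labeling that classifies only a small share of the samples.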
Value
A list of three items:
ncw - A matrix of normalized consensus weights. Output from callNCW.
fcluster - A list of length maxK. Each element is a list containing consensusMatrix (numerical matrix), consensusTree (hclust), and consensusClass (consensus class assignments). ConsensusClusterPlus also produces images. Output from ConsensusClusterPlus::ConsensusClusterPlus.
icl - A list of two elements, clusterConsensus and itemConsensus, corresponding to cluster-consensus and item-consensus. Output from ConsensusClusterPlus::ConsensusClusterPlus.
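To give a feel for a sample-by-sample weight matrix derived from labelings that only partly overlap, the sketch below computes, for each pair of samples, the fraction of labelings covering both samples in which they share a cluster. This is not the callNCW algorithm, whose normalization and permutation steps are not described on this page; the helper name pairwise_agreement and the NA encoding of missing labels are assumptions.

```r
# Hypothetical helper, not part of ccml: for each pair of samples, the
# fraction of labelings (among those covering both) that co-cluster them.
pairwise_agreement <- function(label) {
  label <- as.matrix(label)
  n <- nrow(label)
  agree <- matrix(0, n, n,
                  dimnames = list(rownames(label), rownames(label)))
  covered <- agree
  for (j in seq_len(ncol(label))) {
    lab <- label[, j]
    present <- !is.na(lab)
    both <- outer(present, present, "&")  # both samples have a label
    same <- outer(lab, lab, "==")         # NA where either label is missing
    same[is.na(same)] <- FALSE
    agree <- agree + (same & both)
    covered <- covered + both
  }
  agree / covered  # NaN where no labeling covers both samples
}

# Two toy labelings of five samples with incomplete coverage.
label <- cbind(
  m1 = c("A", "A", "B", "B", NA),
  m2 = c("x", NA,  "y", "y", "y")
)
rownames(label) <- paste0("s", 1:5)

w <- pairwise_agreement(label)
print(w)
```

Pairs that no labeling covers jointly (here s2 and s5) come out as NaN rather than 0, which keeps "never compared" distinct from "always disagreed".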
Examples
# load data
data(example_data)
label <- example_data
# if plot is not NULL, results will be saved in the "result_output" directory
title <- "result_output"
# do not estimate the stability of the permutation numbers
res_1 <- ccml(title = title, label = label, nperm = 3, ncore = 1,
              stability = FALSE, maxK = 5, pItem = 0.8)
# use another method that clusters a distance matrix
res_2 <- ccml(title = title, label = label, nperm = 10, ncore = 1,
              stability = TRUE, maxK = 3, pItem = 0.9, clusterAlg = "hc")
# set the start random seed
res_3 <- ccml(title = title, label = label, output = FALSE, nperm = 5,
              ncore = 1, seedn = 150, stability = TRUE, maxK = 3, pItem = 0.9)