ClussCluster {ClussCluster}    R Documentation

Performs simultaneous detection of cell types and cell-type-specific signature genes

Description

ClussCluster takes single-cell transcriptome data and returns an object containing the cell types and the type-specific signature gene sets.

ClussCluster_Gap selects the tuning parameter using a permutation approach. The tuning parameter controls the L1 bound on w, the feature weights.

Usage

ClussCluster(x, nclust = NULL, centers = NULL, ws = NULL,
  nepoch.max = 10, theta = NULL, seed = 1, nstart = 20,
  iter.max = 50, verbose = FALSE)

ClussCluster_Gap(x, nclust = NULL, B = 20, centers = NULL,
  ws = NULL, nepoch.max = 10, theta = NULL, seed = 1,
  nstart = 20, iter.max = 50, verbose = FALSE)

Arguments

x

An n x p data matrix, with n cells and p genes.

nclust

Number of clusters desired if the cluster centers are not provided. If both are provided, nclust must equal the number of cluster centers.

centers

A set of initial (distinct) cluster centers, used if the number of clusters (nclust) is NULL. If both are provided, the number of cluster centers must equal nclust.

ws

One or more candidate tuning parameters to be evaluated and compared. The tuning parameter determines the sparsity of the selected genes. Each value should be greater than 1.

nepoch.max

The maximum number of epochs. In one epoch, each cell will be evaluated to determine if its label needs to be updated.

theta

Optional argument. If provided, theta is used as the initial cluster labels of the ClussCluster algorithm; if not, K-means is performed to produce the starting cluster labels.

seed

This seed is used wherever K-means is used.

nstart

Argument passed to kmeans. It is the number of random sets used in kmeans.

iter.max

Argument passed to kmeans. The maximum number of iterations allowed.

verbose

Print the updates inside every epoch? If TRUE, the updates to the cluster labels and the value of the objective function are printed.

B

Number of permutation samples.
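
As a rough illustration of how seed, nstart and iter.max relate to a K-means starting step, the sketch below calls stats::kmeans directly on a small simulated matrix. The helper name init_labels and the toy data are hypothetical, and this is not the package's internal code; it only shows what these arguments mean when passed to kmeans.

```r
# Hypothetical helper: K-means starting labels for an n x p matrix (rows = cells).
init_labels <- function(x, nclust, seed = 1, nstart = 20, iter.max = 50) {
  set.seed(seed)  # makes the random starts reproducible
  km <- stats::kmeans(x, centers = nclust, nstart = nstart, iter.max = iter.max)
  km$cluster      # integer vector of length nrow(x)
}

# Simulated toy data: 30 cells, 5 genes, three well-separated groups.
set.seed(42)
toy <- rbind(matrix(rnorm(50, mean = 0),  nrow = 10),
             matrix(rnorm(50, mean = 5),  nrow = 10),
             matrix(rnorm(50, mean = 10), nrow = 10))

theta0 <- init_labels(toy, nclust = 3)
table(theta0)
```

A label vector of this kind has the same shape as what the theta argument expects: one cluster label per cell.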

Details

The input should be the normalized and log-transformed number of reads mapped to genes, e.g., log(RPKM+1) or log(TPM+1), where RPKM stands for Reads Per Kilobase of transcript per Million mapped reads and TPM stands for Transcripts Per Million. The data should NOT be centered.
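
A minimal sketch of preparing such an input in base R is shown below. The count matrix and gene lengths are made up for illustration, and the TPM computation is the usual textbook definition rather than anything prescribed by the package.

```r
# Hypothetical raw counts: 4 cells (rows) x 3 genes (columns).
counts <- matrix(c(10, 0, 5, 20,
                   100, 50, 0, 30,
                   1, 2, 3, 4),
                 nrow = 4,
                 dimnames = list(paste0("cell", 1:4), paste0("gene", 1:3)))
gene_kb <- c(gene1 = 1.0, gene2 = 2.5, gene3 = 0.5)  # assumed gene lengths in kilobases

# TPM: length-normalize each gene, then scale each cell to sum to one million.
rate <- sweep(counts, 2, gene_kb, "/")
tpm  <- rate / rowSums(rate) * 1e6

# Log-transform with a pseudocount of 1; no centering or scaling is applied.
x <- log(tpm + 1)
```

Because the values are not centered, a gene with zero reads in a cell stays at exactly zero after the transform.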

Value

ClussCluster returns a list containing the optimal tuning parameter, s; the cluster labels, theta; and the type-specific gene weights, w.

ClussCluster_Gap returns a list containing a vector of candidate tuning parameters, ws; the corresponding values of the objective function, O; a matrix of objective function values for each permuted data set and tuning parameter, O_b; the gap statistics and their standard deviations, Gap and sd.Gap; the result given by ClussCluster, run; and the tuning parameters with the largest Gap statistic and within one standard deviation of the largest Gap statistic, bestw and onesd.bestw.
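
To make the relationship between O, O_b, Gap, sd.Gap, bestw and onesd.bestw more concrete, here is a small sketch with made-up numbers. It assumes the gap statistic convention common in sparse clustering, Gap(w) = log O(w) minus the mean over permutations of log O_b(w), that O_b holds one row per permutation, and that onesd.bestw is the smallest candidate whose Gap lies within one standard deviation of the maximum. None of these details are confirmed by this page, so treat the code as an illustration rather than the package's computation.

```r
# Hypothetical values; not output of ClussCluster_Gap.
ws  <- c(2, 3, 4, 5)
O   <- c(10, 31, 45, 50)            # objective on the observed data
O_b <- rbind(c(8, 15, 20, 26),      # objective on permuted data,
             c(9, 16, 22, 28),      # one row per permutation (assumed layout)
             c(7, 14, 21, 27))

# Assumed definition: Gap(w) = log O(w) - mean over permutations of log O_b(w).
Gap    <- log(O) - colMeans(log(O_b))
sd.Gap <- apply(log(O_b), 2, sd)

best  <- which.max(Gap)
bestw <- ws[best]

# Assumed one-sd rule: smallest candidate whose Gap is within one sd of the maximum.
onesd.bestw <- ws[min(which(Gap >= Gap[best] - sd.Gap[best]))]
```

Under these assumptions, the one-standard-deviation rule favors a smaller tuning parameter, and hence a sparser gene set, whenever its Gap is statistically close to the best one.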

Examples

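# Load the Hou_sim example data
data(Hou_sim)
hou.dat <- Hou_sim$x
# Filter genes before clustering
run.ft <- filter_gene(hou.dat)
# Cluster into 3 cell types with a single tuning parameter
hou.test <- ClussCluster(run.ft$dat.ft, nclust = 3, ws = 4, verbose = FALSE)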

[Package ClussCluster version 0.1.0 Index]