sdc_mpar {SparseMDC}    R Documentation

SparseDC Multi Parallel

Description

Applies sparse clustering to data from multiple conditions, linking the clusters across conditions and selecting a set of marker variables for each cluster and condition. This is a wrapper function that runs SparseMDC in parallel and chooses the solution with the minimum score for each run.
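The "run several starts, keep the lowest score" idea described above can be sketched in base R. This is only an illustration: k-means (from the stats package) stands in for SparseMDC, the starts run sequentially with lapply rather than in parallel, and run_start, starts, scores and best are hypothetical names that are not part of the package.

```r
# Toy data: two well-separated groups of 20 points in 2 dimensions
set.seed(10)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 4), ncol = 2))

# One start: fit from a seed-dependent initialisation and record its score.
# k-means is a stand-in here; it is NOT the SparseMDC algorithm.
run_start <- function(seed) {
  set.seed(seed)
  fit <- kmeans(x, centers = 2, nstart = 1, iter.max = 20)
  list(clusters = fit$cluster,
       centers  = fit$centers,
       score    = fit$tot.withinss)
}

# Run several starts (sequentially in this sketch)
starts <- lapply(1:5, run_start)

# Collect the score of every start and keep the solution with the minimum
scores <- vapply(starts, function(s) s$score, numeric(1))
best   <- starts[[which.min(scores)]]
```

In a parallel setting the lapply call would be replaced by a parallel loop, while the selection step with which.min stays the same.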

Usage

sdc_mpar(pdat, nclust, dim, lambda1, lambda2, nitter = 20,
  nstarts = 50, init_iter = 5, delta = 1e-07, par_starts, cores)

Arguments

pdat

A list with D entries, one per condition. Entry d contains the data for condition d as a p * n matrix, with variables in rows and samples in columns. The data should be centered and log-transformed.

nclust

Number of clusters in the data.

dim

Total number of conditions, D.

lambda1

The lambda 1 value to use in the SparseMDC function. This value controls the number of marker genes detected for each of the clusters in the final result. This can be calculated using the "lambda1_calculator" function or supplied by the user.

lambda2

The lambda 2 value to use in the SparseMDC function. This value controls the number of genes that show condition-dependent expression within each cell type. This can be calculated using the "lambda2_calculator" function or supplied by the user.

nitter

The maximum number of iterations for each of the start values. The default value is 20.

nstarts

The maximum number of possible starts. The default value is 50.

init_iter

The number of iterations used to initialize the algorithm. Higher values result in fewer but more accurate starts, and vice versa. The default value is 5.

delta

Small term to ensure the existence of a solution. The default value is 1e-07.

par_starts

Number of parallel starts.

cores

Number of cores to use.
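The layout expected for pdat (a list of D matrices, each p * n, log-transformed and centered) can be sketched with simulated data in base R. The transformation used here, log(x + 1) followed by centering each row on its mean over all conditions pooled, is an assumption made for illustration; the package's pre_proc_data function may differ in detail. The names raw, logged and pdat_sketch are hypothetical.

```r
set.seed(1)
p          <- 5            # number of variables (rows)
n_per_cond <- c(4, 6, 3)   # number of samples (columns) in each condition
D          <- length(n_per_cond)

# One p x n matrix of simulated non-negative counts per condition
raw <- lapply(n_per_cond, function(n) {
  matrix(rpois(p * n, lambda = 10), nrow = p)
})

# Log-transform (assumed here to be log(x + 1))
logged <- lapply(raw, log1p)

# Center each row on its mean across all conditions pooled (assumption)
row_means   <- rowMeans(do.call(cbind, logged))
pdat_sketch <- lapply(logged, function(m) m - row_means)

# Each entry keeps its p x n shape; dim would be D in a call to sdc_mpar
vapply(pdat_sketch, ncol, integer(1))
```

The conditions may have different numbers of samples, but every matrix must have the same p rows in the same order.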

Value

A list containing cluster assignments, center values and the scores for each start.

Examples

set.seed(10)
# Select small dataset for example
data_test <- data_biase[1:100,]
# Split data into conditions A, B and C
data_A <- data_test[ , which(condition_biase == "A")]
data_B <- data_test[ , which(condition_biase == "B")]
data_C <- data_test[ , which(condition_biase == "C")]
# Store data as list
dat_l <- list(data_A, data_B, data_C)
# Pre-process the data
pdat <- pre_proc_data(dat_l, dim = 3, norm = FALSE, log = TRUE,
                      center = TRUE)
# Calculate lambda1
lambda1 <- lambda1_calculator(pdat, dim = 3, nclust = 3)
# Calculate lambda2
lambda2 <- lambda2_calculator(pdat, dim = 3, nclust = 3, lambda1 = lambda1)
# Prepare parallel environment
library(doParallel) # Parallel backend
library(foreach)    # Parallel loop construct
library(doRNG)      # Reproducible random numbers in parallel loops
# Apply SparseMDC
smdc_res <- sdc_mpar(pdat, nclust = 3, dim = 3, lambda1 = lambda1,
                     lambda2 = lambda2, par_starts = 2, cores = 2)

[Package SparseMDC version 0.99.5 Index]