R: Fit an empirical Bayes prior to the data

empiricalBayesPrior {clusternomics}

R Documentation

Fit an empirical Bayes prior to the data

Description

Fits an empirical Bayes prior to the data. The resulting prior object can be passed to the `contextCluster` function.

Usage

empiricalBayesPrior(datasets, distributions = "diagNormal",
  globalConcentration = 0.1, localConcentration = 0.1, type = "fitRate")

Arguments

`datasets`	List of data matrices where each matrix represents a context-specific dataset. Each data matrix has the size N times M, where N is the number of data points and M is the dimensionality of the data. The full list of matrices has length C. The number of data points N must be the same for all data matrices.
`distributions`	Distribution of the data in each dataset. Either a list of length C, where `distributions[c]` is the distribution of dataset c, or a single string when all datasets share the same distribution. The currently implemented option is `'diagNormal'`, a multivariate Normal distribution with a diagonal covariance matrix.
`globalConcentration`	Prior concentration parameter for the global clusters. Small values of this parameter give larger prior probability to a smaller number of clusters.
`localConcentration`	Prior concentration parameter for the local, context-specific clusters. Small values of this parameter give larger prior probability to a smaller number of clusters.
`type`	Type of prior fitted to the data. The algorithm can fit either the rate of the prior covariance matrix (`"fitRate"`, the default) or the full covariance matrix.
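
The shape requirements on `datasets` and `distributions` can be checked before calling the function. The following is a minimal sketch in base R; `checkInputs` is a hypothetical helper written for illustration and is not part of clusternomics, and the simulated matrices are placeholders.

```r
# Hypothetical helper (not part of clusternomics): verifies the input
# shapes described in the Arguments section.
checkInputs <- function(datasets, distributions = "diagNormal") {
  # `datasets` must be a non-empty list of matrices
  stopifnot(is.list(datasets), length(datasets) > 0)
  stopifnot(all(vapply(datasets, is.matrix, logical(1))))

  # The number of data points N must be the same in every context
  nPoints <- vapply(datasets, nrow, integer(1))
  if (length(unique(nPoints)) != 1) {
    stop("All datasets must have the same number of rows N.")
  }

  # A single string is recycled to one entry per context
  nContexts <- length(datasets)
  if (length(distributions) == 1) {
    distributions <- rep(distributions, nContexts)
  }
  if (length(distributions) != nContexts) {
    stop("`distributions` must have length 1 or length C.")
  }

  list(N = nPoints[[1]],
       C = nContexts,
       M = vapply(datasets, ncol, integer(1)),
       distributions = unlist(distributions))
}

# Two contexts with N = 100 points each and dimensionality M = 2 and M = 3
set.seed(1)
datasets <- list(matrix(rnorm(200), nrow = 100, ncol = 2),
                 matrix(rnorm(300), nrow = 100, ncol = 3))
info <- checkInputs(datasets)
```

Here `info$C` is the number of contexts and `info$M` holds the dimensionality of each dataset; M may differ between contexts, whereas N may not.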

Value

Returns a prior object that can be passed as the `prior` argument of the `contextCluster` function.

Examples

# Example with simulated data (see vignette for details)
library(clusternomics)
nContexts <- 2
# Number of elements in each cluster
groupCounts <- c(50, 10, 40, 60)
# Centers of clusters
means <- c(-1.5,1.5)
testData <- generateTestData_2D(groupCounts, means)
datasets <- testData$data

# Generate the prior
fullDataDistributions <- rep('diagNormal', nContexts)
prior <- empiricalBayesPrior(datasets, distributions = fullDataDistributions,
     globalConcentration = 0.01, localConcentration = 0.1, type = 'fitRate')

# Fit the model
# 1. Specify the number of global clusters and of clusters in each context
clusterCounts <- list(global=10, context=c(3,3))
# 2. Run inference
# The number of iterations here is for demonstration purposes only;
# use a larger number of iterations in practice!
results <- contextCluster(datasets, clusterCounts,
     maxIter = 10, burnin = 5, lag = 1,
     dataDistributions = 'diagNormal', prior = prior,
     verbose = TRUE)

[Package clusternomics version 0.1.1 Index]