probaPost {HTSCluster} | R Documentation |
Calculate the conditional probability of belonging to each cluster in a Poisson mixture model
Description
This function computes the conditional probabilities t_{ik}
that an observation i arises from the k^{\mathrm{th}}
component for the current value of the mixture parameters.
Usage
probaPost(y, g, conds, pi, s, lambda)
Arguments
y |
(n x q) matrix of observed counts for n observations and q variables |
g |
Number of clusters |
conds |
Vector of length q defining the condition (treatment group) for each variable (column) in |
pi |
Vector of length |
s |
Vector of length q containing the estimates for the normalized library size parameters for each of the q variables in |
lambda |
(d x |
Value
t |
(n x |
Note
If all values of t_{ik}
are 0 (or nearly zero), the observation is assigned with probability one to belong to the cluster with the closest mean (in terms of the Euclidean distance from the observation). To avoid calculation difficulties, extreme values of t_{ik}
are smoothed, such that those smaller than 1e-10 or larger than 1-1e-10 are set equal to 1e-10 and 1-1e-10, respectively.
Author(s)
Andrea Rau
References
Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux G. (2015). Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9):1420-1427.
Rau, A., Celeux, G., Martin-Magniette, M.-L., Maugis-Rabusseau, C. (2011). Clustering high-throughput sequencing data with Poisson mixture models. Inria Research Report 7786. Available at https://inria.hal.science/inria-00638082.
See Also
PoisMixClus
for Poisson mixture model estimation and model selection;
PoisMixMean
to calculate the conditional per-cluster mean of each observation
Examples
set.seed(12345)
## Simulate data as shown in Rau et al. (2011)
## Library size setting "A", high cluster separation
## n = 200 observations
simulate <- PoisMixSim(n = 200, libsize = "A", separation = "high")
y <- simulate$y
conds <- simulate$conditions
s <- colSums(y) / sum(y) ## TC estimate of lib size
## Run the PMM-II model for g = 3
## "TC" library size estimate, EM algorithm
run <- PoisMixClus(y, g = 3, norm = "TC",
conds = conds)
pi.est <- run$pi
lambda.est <- run$lambda
## Calculate the conditional probability of belonging to each cluster
proba <- probaPost(y, g = 3, conds = conds, pi = pi.est, s = s,
lambda = lambda.est)
## head(round(proba,2))