getReadMatrix.NB {HeritSeq}R Documentation

Simulate a count matrix from negative binomial mixed effect models (NBMM).

Description

Simulate a (possibly unbalanced) count matrix from NBMM. Under NBMM, an observed number of reads aligned to feature/gene gg, YgsrY_{gsr}, follows a negative binomial (NB) distribution with mean μgs\mu_{gs} and variance μgs+ϕgμgs2\mu_{gs}+\phi_{g} \mu_{gs}^2, where ϕg\phi_g is the dispersion parameter, shared across strains. The generalized linear model uses a log\log-link:
log(μgs)=αg+bgs,    bgsN(0,σg2).\log(\mu_{gs}) = \alpha_g+ b_{gs}, \;\;b_{gs}\sim N(0, \sigma^2_g).

Usage

getReadMatrix.NB(vec.num.rep, alphas, sigma2s, phis)

Arguments

vec.num.rep

A vector of replicate numbers for each strain.

alphas

Intercept vector αg\alpha_g's, 1×num.features1 \times \texttt{num.features}.

sigma2s

Random effect variance vector σg2\sigma^2_g's, 1×num.features1 \times \texttt{num.features}.

phis

Dispersion parameter in NB models, ϕg\phi_g's, a 1×num.features1 \times \texttt{num.features} vector.

Value

A G×NG \times N matrix with NB reads. NN is the total number of samples; GG is the number of features. Column names are sample names of the form "Ss_r", where S stands for sample, s is the strain number, r is the replicate number within the strain. Row names are the feature names of the form "Gene g", where g is the feature index.

Examples

## Generate a sequencing dataset with 5 features and 6 strains. 
## Assign parameter values.
rep.num <- c(3, 5, 2, 3, 4, 2)
a0s <- c(-1, 1, 2, 5, 10)
sig2s <- c(10, 0.2, 0.1, 0.03, 0.01)
phis <- c(0.5, 1, 0.05, 0.01, 0.1)

set.seed(1234)
## Generate reads:
nbData <- getReadMatrix.NB(rep.num, a0s, sig2s, phis)

[Package HeritSeq version 1.0.2 Index]