rgdirmn {MGLM}    R Documentation

The Generalized Dirichlet Multinomial Distribution

Description

rgdirmn generates random observations from the generalized Dirichlet multinomial distribution. dgdirmn computes the log of the generalized Dirichlet multinomial probability mass function.

Usage

rgdirmn(n, size, alpha, beta)

dgdirmn(Y, alpha, beta)

Arguments

n

the number of random vectors to generate. When size is a scalar and alpha is a vector, n must be specified. When size is a vector and alpha is a matrix, n is optional: it defaults to the length of size and, if given, should equal the length of size.

size

a number or vector specifying the total number of objects that are put into d categories in the generalized Dirichlet multinomial distribution.

alpha

the parameter of the generalized Dirichlet multinomial distribution. alpha is a numerical positive vector or matrix.

For dgdirmn, alpha should match the size of Y. If alpha is a vector, it will be replicated n times, where n is the number of rows of Y, to match the dimension of Y.

For rgdirmn, if alpha is a vector, size must be a scalar. All the random vectors will be drawn from the same alpha and size. If alpha is a matrix, the number of rows should match the length of size. Each random vector will be drawn from the corresponding row of alpha and the corresponding element of size.

beta

the parameter of the generalized Dirichlet multinomial distribution. beta should have the same dimension as alpha.

For rgdirmn, if beta is a vector, size must be a scalar. All the random vectors will be drawn from the same beta and size. If beta is a matrix, the number of rows should match the length of size. Each random vector will be drawn from the corresponding row of beta and the corresponding element of size.

Y

the multivariate count matrix with dimensions $n \times d$, where $n = 1, 2, \ldots$ is the number of observations and $d = 3, 4, \ldots$ is the number of categories.
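The shape rules in the argument descriptions can be illustrated with a small base R sketch. It assumes that alpha and beta each have d - 1 entries, as in the Details section, and the matrix Y and the names alpha_mat and beta_mat are hypothetical. It only mimics the replication described for vector parameters and does not call any MGLM function.

```r
# Hypothetical count matrix: n = 4 observations, d = 3 categories.
Y <- matrix(c(5, 3, 2,
              0, 7, 3,
              4, 4, 2,
              1, 0, 9), nrow = 4, byrow = TRUE)

# Vector parameters with d - 1 = 2 entries (assumed length, see Details).
alpha <- c(0.2, 0.5)
beta  <- c(0.7, 0.4)

n <- nrow(Y)
d <- ncol(Y)
stopifnot(length(alpha) == d - 1, length(beta) == d - 1)

# Replicate each vector n times so that row i pairs with observation i.
alpha_mat <- matrix(alpha, nrow = n, ncol = d - 1, byrow = TRUE)
beta_mat  <- matrix(beta,  nrow = n, ncol = d - 1, byrow = TRUE)

# The row totals play the role of size for each observation.
size <- rowSums(Y)
```

With matrix parameters, each row would instead carry its own values, and the number of rows would have to equal the length of size.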

Details

$Y = (y_1, \ldots, y_d)$ is the vector of counts for the $d$ categories. Given the parameter vectors $\alpha = (\alpha_1, \ldots, \alpha_{d-1})$, $\alpha_j > 0$, and $\beta = (\beta_1, \ldots, \beta_{d-1})$, $\beta_j > 0$, the generalized Dirichlet multinomial probability mass function is

$$P(y \mid \alpha, \beta) = C_{y_1, \ldots, y_d}^{m} \prod_{j=1}^{d-1} \frac{\Gamma(\alpha_j + y_j)}{\Gamma(\alpha_j)} \frac{\Gamma(\beta_j + z_{j+1})}{\Gamma(\beta_j)} \frac{\Gamma(\alpha_j + \beta_j)}{\Gamma(\alpha_j + \beta_j + z_j)},$$

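where $z_j = \sum_{k=j}^{d} y_k$ and $m = \sum_{j=1}^{d} y_j$. Here, $C_k^n$, often read as "$n$ choose $k$", refers to the number of $k$-combinations from a set of $n$ elements; with several lower indices, $C_{y_1, \ldots, y_d}^{m} = m! / (y_1! \cdots y_d!)$ is the multinomial coefficient.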
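As a rough illustration of the formula, the log of the probability mass function for a single count vector can be written in base R with lgamma. The helper gdm_logpmf is hypothetical and is not part of MGLM; it is a direct transcription of the expression above, not the package's implementation, and it assumes alpha and beta have d - 1 entries.

```r
# Log-pmf of one count vector y, transcribed from the formula in Details.
# gdm_logpmf is a hypothetical helper, not an MGLM function.
gdm_logpmf <- function(y, alpha, beta) {
  d <- length(y)
  stopifnot(length(alpha) == d - 1, length(beta) == d - 1)
  m <- sum(y)
  # z[j] = y[j] + ... + y[d], so z[1] equals m.
  z <- rev(cumsum(rev(y)))
  j <- seq_len(d - 1)
  # Log of the multinomial coefficient m! / (y_1! ... y_d!).
  log_coef <- lgamma(m + 1) - sum(lgamma(y + 1))
  log_coef + sum(
    lgamma(alpha + y[j]) - lgamma(alpha) +
    lgamma(beta + z[j + 1]) - lgamma(beta) +
    lgamma(alpha + beta) - lgamma(alpha + beta + z[j])
  )
}

# Sanity check: enumerate every count vector with total m for d = 3
# and add up the probabilities.
m <- 4
alpha <- c(0.2, 0.5)
beta  <- c(0.7, 0.4)
grid <- expand.grid(y1 = 0:m, y2 = 0:m)
grid <- grid[grid$y1 + grid$y2 <= m, ]
logp <- mapply(
  function(y1, y2) gdm_logpmf(c(y1, y2, m - y1 - y2), alpha, beta),
  grid$y1, grid$y2
)
total <- sum(exp(logp))  # should be numerically close to 1
```

Working on the log scale with lgamma avoids overflow in the gamma function for large counts, which is presumably why dgdirmn reports the log of the probability rather than the probability itself.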

The $\alpha$ and $\beta$ parameters can be vectors, like the results from the distribution fitting function, or they can be matrices with $n$ rows, such as $\exp(X\alpha)$ and $\exp(X\beta)$, obtained by multiplying the covariate matrix $X$ by the coefficient estimates from the regression function and exponentiating.
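The matrix form can be sketched in base R as follows. The design matrix X and the coefficient matrices A and B are made-up values chosen only for illustration; in practice the coefficients would come from a fitted regression model, which is not shown here.

```r
# Hypothetical design matrix: n = 5 observations, intercept plus one covariate.
X <- cbind(1, c(-1, -0.5, 0, 0.5, 1))

# Hypothetical coefficient matrices with p = 2 rows and d - 1 = 3 columns.
A <- matrix(c(0.1, -0.2,  0.3,
              0.5,  0.0, -0.4), nrow = 2, byrow = TRUE)
B <- matrix(c(-0.3, 0.2,  0.1,
               0.2, 0.4, -0.1), nrow = 2, byrow = TRUE)

# The exponential keeps every entry positive, as alpha and beta require.
alpha_mat <- exp(X %*% A)
beta_mat  <- exp(X %*% B)
```

Each of the resulting matrices has one row per observation, so row i would supply the parameters for the i-th row of a count matrix with d = 4 categories.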

Value

dgdirmn returns the value of $\log(P(y \mid \alpha, \beta))$. When Y is a matrix of $n$ rows, the function dgdirmn returns a vector of length $n$.

rgdirmn returns an $n \times d$ matrix of the generated random observations.
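To make the shape of the returned matrix concrete, here is a minimal base R sketch of one way to draw such observations. It relies on the fact that the probability mass function in Details factors into d - 1 beta-binomial terms, so each count can be drawn in turn from the remaining total. The function rgdm_sketch is hypothetical, handles only vector alpha and beta with a scalar size, and is not claimed to be how rgdirmn is implemented.

```r
# Hypothetical sampler built from sequential beta-binomial draws.
# A sketch of the distribution, not MGLM's rgdirmn code.
rgdm_sketch <- function(n, size, alpha, beta) {
  stopifnot(length(alpha) == length(beta), length(size) == 1)
  d <- length(alpha) + 1
  Y <- matrix(0, nrow = n, ncol = d)
  for (i in seq_len(n)) {
    remaining <- size
    for (j in seq_len(d - 1)) {
      # Probability of category j among categories j, ..., d.
      p <- rbeta(1, alpha[j], beta[j])
      Y[i, j] <- rbinom(1, remaining, p)
      remaining <- remaining - Y[i, j]
    }
    # Whatever is left goes to the last category.
    Y[i, d] <- remaining
  }
  Y
}

set.seed(1)
Y <- rgdm_sketch(10, 20, alpha = c(0.2, 0.5), beta = c(0.7, 0.4))
dim(Y)      # n rows and d = length(alpha) + 1 columns
rowSums(Y)  # every row adds up to size
```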

Examples

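# example 1: one shared parameter vector for all draws
library(MGLM)
m <- 20                            # total count per observation
alpha <- c(0.2, 0.5)               # d - 1 = 2 parameters, so d = 3 categories
beta <- c(0.7, 0.4)
Y <- rgdirmn(10, m, alpha, beta)   # 10 x 3 matrix of counts
dgdirmn(Y, alpha, beta)            # vector of 10 log-probabilities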

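# example 2: one parameter row and one size per draw
set.seed(100)
alpha <- matrix(abs(rnorm(40)), 10, 4)   # 10 rows, d - 1 = 4 columns
beta <- matrix(abs(rnorm(40)), 10, 4)
size <- rbinom(10, 10, 0.5)              # one total count per row
# n is omitted here, so it defaults to length(size)
GDM.rdm <- rgdirmn(size=size, alpha=alpha, beta=beta)
# scalar size with vector alpha and beta, so n must be given
GDM.rdm1 <- rgdirmn(n=20, size=10, alpha=abs(rnorm(4)), beta=abs(rnorm(4)))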

[Package MGLM version 0.2.1 Index]