R: The Generalized Dirichlet Multinomial Distribution

rgdirmn {MGLM}

R Documentation

The Generalized Dirichlet Multinomial Distribution

Description

rgdirmn generates random observations from the generalized Dirichlet multinomial distribution. dgdirmn computes the log of the generalized Dirichlet multinomial probability mass function.

Usage

rgdirmn(n, size, alpha, beta)

dgdirmn(Y, alpha, beta)

Arguments

`n`	the number of random vectors to generate. When `size` is a scalar and `alpha` is a vector, `n` must be specified. When `size` is a vector and `alpha` is a matrix, `n` is optional: it defaults to the length of `size` and, if given, should equal the length of `size`.
`size`	a number or vector specifying the total number of objects that are put into d categories in the generalized Dirichlet multinomial distribution.
`alpha`	the parameter of the generalized Dirichlet multinomial distribution. `alpha` is a numerical positive vector or matrix. For `dgdirmn`, `alpha` should match the size of `Y`. If `alpha` is a vector, it will be replicated `n` times to match the dimension of `Y`. For `rgdirmn`, if `alpha` is a vector, `size` must be a scalar, and all the random vectors will be drawn from the same `alpha` and `size`. If `alpha` is a matrix, the number of rows should match the length of `size`, and each random vector will be drawn from the corresponding row of `alpha` and the corresponding element of `size`.
`beta`	the parameter of the generalized Dirichlet multinomial distribution. `beta` should have the same dimension as `alpha`. For `rgdirmn`, if `beta` is a vector, `size` must be a scalar, and all the random vectors will be drawn from the same `beta` and `size`. If `beta` is a matrix, the number of rows should match the length of `size`, and each random vector will be drawn from the corresponding row of `beta` and the corresponding element of `size`.
`Y`	the multivariate count matrix with dimensions `n \times d`, where `n = 1,2, \ldots` is the number of observations and `d=3,4,\ldots` is the number of categories.

Details

y=(y_1, \ldots, y_d) is a vector of counts over the d categories, that is, one row of Y. Given the parameter vectors \alpha = (\alpha_1, \ldots, \alpha_{d-1}), \alpha_j>0, and \beta=(\beta_1, \ldots, \beta_{d-1}), \beta_j>0, the generalized Dirichlet multinomial probability mass function is

P(y|\alpha,\beta) =C_{y_1, \ldots, y_d}^{m} \prod_{j=1}^{d-1} \frac{\Gamma(\alpha_j+y_j)}{\Gamma(\alpha_j)} \frac{\Gamma(\beta_j+z_{j+1})}{\Gamma(\beta_j)} \frac{\Gamma(\alpha_j+\beta_j)}{\Gamma(\alpha_j+\beta_j+z_j)} ,

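where z_j = \sum_{k=j}^d y_k and m = \sum_{j=1}^d y_j. Here, C_{y_1, \ldots, y_d}^{m} = m!/(y_1! \cdots y_d!) is the multinomial coefficient. It generalizes C_k^n, often read as "n choose k", which refers to the number of k-combinations from a set of n elements.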
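
The mass function above can be evaluated on the log scale with `lgamma`. The following is a minimal base R sketch written directly from the formula; `gdm_logpmf` is a hypothetical helper for illustration, not a function of MGLM, and it handles a single count vector only.

```r
# Hypothetical helper (not part of MGLM): log-pmf of one count vector y
# under the generalized Dirichlet multinomial formula above.
gdm_logpmf <- function(y, alpha, beta) {
  d <- length(y)
  stopifnot(d >= 2, length(alpha) == d - 1, length(beta) == d - 1,
            all(alpha > 0), all(beta > 0))
  m <- sum(y)
  # z[j] = y[j] + ... + y[d]
  z <- rev(cumsum(rev(y)))
  j <- seq_len(d - 1)
  # log of the multinomial coefficient m! / (y_1! ... y_d!)
  log_coef <- lgamma(m + 1) - sum(lgamma(y + 1))
  log_coef + sum(
    lgamma(alpha + y[j]) - lgamma(alpha) +
      lgamma(beta + z[j + 1]) - lgamma(beta) +
      lgamma(alpha + beta) - lgamma(alpha + beta + z[j])
  )
}

# Sanity check: sum the probabilities of every outcome with m = 4, d = 3.
alpha <- c(0.2, 0.5)
beta <- c(0.7, 0.4)
m <- 4
grid <- expand.grid(y1 = 0:m, y2 = 0:m)
grid <- grid[grid$y1 + grid$y2 <= m, ]
total <- sum(apply(grid, 1, function(r) {
  y <- c(r[["y1"]], r[["y2"]], m - r[["y1"]] - r[["y2"]])
  exp(gdm_logpmf(y, alpha, beta))
}))
```

Since the formula defines a probability mass function, `total` should equal 1 up to floating-point error, which makes a quick check on any reimplementation.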

The \alpha and \beta parameters can be vectors, such as the estimates returned by the distribution fitting function, or matrices with n rows, such as exp(X\alpha) and exp(X\beta), where X is the covariate matrix and \alpha and \beta here denote the coefficient estimates from the regression function.
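
As an illustration of the matrix case, the sketch below builds n-row parameter matrices from a covariate matrix through the log link. All values are made up for the example: `X`, `coef_alpha`, and `coef_beta` are hypothetical names, not output from any MGLM fitting function.

```r
# Illustration only: X, coef_alpha and coef_beta hold made-up values.
set.seed(1)
n <- 5   # observations
p <- 2   # covariates, including the intercept
d <- 4   # categories
X <- cbind(1, rnorm(n))                                    # n x p
coef_alpha <- matrix(rnorm(p * (d - 1), sd = 0.3), p, d - 1)
coef_beta <- matrix(rnorm(p * (d - 1), sd = 0.3), p, d - 1)

# exp() keeps every entry positive; row i holds the parameters
# for observation i.
alpha_mat <- exp(X %*% coef_alpha)                         # n x (d - 1)
beta_mat <- exp(X %*% coef_beta)                           # n x (d - 1)
```

Matrices of this shape correspond to the matrix form of `alpha` and `beta` described under Arguments, with one row per observation.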

Value

dgdirmn returns the value of \log(P(y|\alpha, \beta)). When Y is a matrix of n rows, the function dgdirmn returns a vector of length n.

rgdirmn returns an n\times d matrix of the generated random observations.
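
To make the shape of the output concrete, here is a rough base R sketch of one way to draw such observations. It relies on the fact that the mass function in Details factors into d-1 beta-binomial terms, so counts can be assigned one category at a time. `rgdm_sketch` is a hypothetical name, and this is not necessarily how rgdirmn is implemented; it covers only vector `alpha` and `beta` with a scalar `size`.

```r
# Hypothetical sampler (not MGLM's implementation): chain beta-binomial
# draws, one per category, for vector alpha/beta and scalar size.
rgdm_sketch <- function(n, size, alpha, beta) {
  stopifnot(length(alpha) == length(beta), all(alpha > 0), all(beta > 0))
  d <- length(alpha) + 1
  Y <- matrix(0L, n, d)
  remaining <- rep(as.integer(size), length.out = n)
  for (j in seq_len(d - 1)) {
    p <- rbeta(n, alpha[j], beta[j])     # share of the remaining objects
    Y[, j] <- rbinom(n, remaining, p)    # counts assigned to category j
    remaining <- remaining - Y[, j]
  }
  Y[, d] <- remaining                    # last category takes the rest
  Y
}

set.seed(100)
Y <- rgdm_sketch(10, 20, c(0.2, 0.5), c(0.7, 0.4))
```

Each row of `Y` is one observation over d = 3 categories, and every row sums to `size`.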

Examples

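# example 1: the same alpha and beta for every observation (d = 3 categories)
m <- 20
alpha <- c(0.2, 0.5)
beta <- c(0.7, 0.4)
Y <- rgdirmn(10, m, alpha, beta)
# one log-probability per row of Y
dgdirmn(Y, alpha, beta)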

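# example 2: a separate row of alpha and beta, and a separate size, per observation
set.seed(100)
alpha <- matrix(abs(rnorm(40)), 10, 4)
beta <- matrix(abs(rnorm(40)), 10, 4)
size <- rbinom(10, 10, 0.5)
# n is omitted and defaults to length(size)
GDM.rdm <- rgdirmn(size=size, alpha=alpha, beta=beta)
# vector parameters with a scalar size: n must be given
GDM.rdm1 <- rgdirmn(n=20, size=10, alpha=abs(rnorm(4)), beta=abs(rnorm(4)))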

[Package MGLM version 0.2.1 Index]