dpmd {PoissonMultinomial}    R Documentation

Probability Mass Function of Poisson-Multinomial Distribution

Description

Computes the pmf of the Poisson-Multinomial distribution (PMD), specified by the success probability matrix, using one of several methods. The function can compute the pmf at all probability mass points or only at user-specified points.

Usage

dpmd(pmat, xmat = NULL, method = "DFT-CF", B = 1000)

Arguments

pmat

An \rm n \times m success probability matrix. Here, \rm n is the number of independent trials and \rm m is the number of categories. Each row of pmat gives the success probabilities for the corresponding trial and should sum to 1.

xmat

A matrix with \rm m columns that specifies the points at which the pmf is computed. Each row should have the form \rm x = (x_{1}, \ldots, x_{m}), for which \rm P(X_{1}=x_{1}, \ldots, X_{m} = x_{m}) is computed; the entries of \rm x should sum to \rm n. A vector of length \rm m is also accepted. If xmat is NULL, the pmf at all probability mass points is computed.

method

Character string naming the method used to compute the pmf. It must be one of the following three: "DFT-CF", "NA", "SIM".

B

Number of replications used by the simulation method. It is ignored for methods other than "SIM".

Details

Consider \rm n independent trials, each of which results in a success for exactly one of \rm m categories. The success probabilities of the categories may vary from trial to trial. The Poisson-Multinomial distribution (PMD) gives the probability of any particular combination of success counts across the \rm m categories. The success probabilities form an \rm n \times m matrix, called the success probability matrix and denoted by pmat.

dpmd offers three methods. "DFT-CF" is an exact method that computes all probability mass points of the distribution using a multi-dimensional FFT algorithm. As the dimensions of pmat grow, "DFT-CF" can become computationally demanding, because it always computes all probability mass points regardless of the input of xmat.
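The idea behind a characteristic-function approach can be sketched in base R. This is an illustration only, not the package's implementation: pmd_fft is a hypothetical helper, and the sketch assumes the pmf is recovered by evaluating the characteristic function of the first m - 1 counts on a grid of n + 1 frequencies per dimension and then applying a multi-dimensional discrete Fourier transform.

```r
# Hypothetical helper (not part of PoissonMultinomial): exact pmf via the
# characteristic function and base R's multi-dimensional fft().
pmd_fft <- function(pmat) {
  n <- nrow(pmat)
  m <- ncol(pmat)
  # Frequency grid: all combinations of k_j in 0..n for the first m - 1 categories.
  # expand.grid varies the first column fastest, matching R's array layout.
  k <- as.matrix(expand.grid(rep(list(0:n), m - 1)))
  E <- exp(1i * 2 * pi * k / (n + 1))
  # Characteristic function at each grid point: the product over trials of
  # p_im + sum_{j < m} p_ij * exp(1i * omega_j).
  phi <- rep(1 + 0i, nrow(k))
  for (i in seq_len(n)) {
    phi <- phi * (pmat[i, m] + drop(E %*% pmat[i, -m]))
  }
  phi <- array(phi, dim = rep(n + 1, m - 1))
  # The forward DFT of the characteristic function, rescaled, gives the pmf.
  Re(fft(phi)) / (n + 1)^(m - 1)
}

pp <- matrix(c(.1, .1, .1, .7, .1, .3, .3, .3, .5, .2, .1, .2), nrow = 3, byrow = TRUE)
res <- pmd_fft(pp)
dim(res)       # 4 4 4
res[1, 2, 1]   # P(X1 = 0, X2 = 1, X3 = 0, X4 = 2), approximately 0.09
sum(res)       # approximately 1
```

The array has (n + 1)^(m - 1) cells, which is why the cost of this kind of method grows quickly with both n and m.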

"SIM" is a simulation method that generates random samples from the distribution, and uses relative frequency to estimate the pmf. Note that the accuracy and running time will be affected by user choice of B. Usually B=1e5 or 1e6 will be accurate enough. Increasing B to larger than 1e8 will heavily increase the computational burden of the computer.

"NA" is an approximation method that uses a multivariate normal distribution to approximate the pmf at the points specified in xmat. This method requires an input of xmat.

Note that if xmat is not specified, it defaults to NULL. In this case, dpmd computes the entire pmf if the chosen method is "DFT-CF" or "SIM". If xmat is provided, only the pmf at the points specified by xmat is returned.

Value

For a given xmat, dpmd returns the pmf at points specified by xmat.

If xmat is NULL, all probability mass points of the distribution specified by the success probability matrix pmat are computed, and the results are returned in a multi-dimensional array, denoted by res. Since pmat is \rm n \times m, res is an array with \rm m-1 dimensions, each of extent \rm n+1, for a total of \rm (n+1)^{(m-1)} entries. The value of the pmf \rm P(X_{1}=x_{1}, \ldots, X_{m} = x_{m}) can then be extracted as \rm res[x_{1}+1, \ldots, x_{m-1}+1]; the last count is implied by \rm x_{m} = n - x_{1} - \ldots - x_{m-1}.

For example, for the pmat matrix in the Examples section, the array element res[1,2,1]=0.09 gives the value of the pmf \rm P(X_{1}=0, X_{2}=1, X_{3}=0, X_{4}=2)=0.09.
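The indexing convention can be checked by brute force in base R, without the package. The sketch below enumerates every way of assigning the trials to categories and accumulates the probabilities into an array with the layout described above. It is feasible only for tiny inputs (m^n assignments), and the variable names are illustrative rather than taken from the package.

```r
pp <- matrix(c(.1, .1, .1, .7, .1, .3, .3, .3, .5, .2, .1, .2), nrow = 3, byrow = TRUE)
n <- nrow(pp)
m <- ncol(pp)

# All m^n ways to assign each trial to one category (one assignment per row).
assignments <- as.matrix(expand.grid(rep(list(seq_len(m)), n)))

res <- array(0, dim = rep(n + 1, m - 1))
for (r in seq_len(nrow(assignments))) {
  cats <- assignments[r, ]
  prob <- prod(pp[cbind(seq_len(n), cats)])  # trials are independent
  counts <- tabulate(cats, nbins = m)        # (x_1, ..., x_m)
  idx <- matrix(counts[-m] + 1, nrow = 1)    # position [x_1 + 1, ..., x_{m-1} + 1]
  res[idx] <- res[idx] + prob
}

res[1, 2, 1]   # P(X1 = 0, X2 = 1, X3 = 0, X4 = 2) = 0.006 + 0.042 + 0.042 = 0.09
res[1, 1, 2]   # P(X1 = 0, X2 = 0, X3 = 1, X4 = 2) = 0.006 + 0.042 + 0.021 = 0.069
```

Cells whose first m - 1 indices imply a total count above n correspond to impossible outcomes and stay at zero.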

References

Lin, Z., Wang, Y., and Hong, Y. (2023). The computing of the Poisson multinomial distribution and applications in ecological inference and machine learning, Computational Statistics, Vol. 38, pp. 1851-1877.

Examples

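library(PoissonMultinomial)

# Success probability matrix: 3 trials (rows), 4 categories (columns); each row sums to 1
pp <- matrix(c(.1, .1, .1, .7, .1, .3, .3, .3, .5, .2, .1, .2), nrow = 3, byrow = TRUE)
# A single point given as a vector, and two points given as matrix rows
x <- c(0, 0, 1, 2)
x1 <- matrix(c(0, 0, 1, 2, 2, 1, 0, 0), nrow = 2, byrow = TRUE)

# Exact method (default "DFT-CF"): the full pmf, then selected points
dpmd(pmat = pp)
dpmd(pmat = pp, xmat = x1)
dpmd(pmat = pp, xmat = x)

# Normal approximation: xmat must be supplied
dpmd(pmat = pp, xmat = x, method = "NA")
dpmd(pmat = pp, xmat = x1, method = "NA")

# Simulation: estimates vary from run to run
dpmd(pmat = pp, method = "SIM", B = 1e3)
dpmd(pmat = pp, xmat = x, method = "SIM", B = 1e3)
dpmd(pmat = pp, xmat = x1, method = "SIM", B = 1e3)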


[Package PoissonMultinomial version 1.1 Index]