dpmd {PoissonMultinomial} | R Documentation |
Probability Mass Function of Poisson-Multinomial Distribution
Description
Computes the pmf of the Poisson-Multinomial distribution (PMD), specified by the success probability matrix, using one of several methods. The function can compute either the entire pmf (all probability mass points) or the pmf at specified point(s).
Usage
dpmd(pmat, xmat = NULL, method = "DFT-CF", B = 1000)
Arguments
pmat |
An |
xmat |
A matrix with |
method |
Character string naming the method used to compute the pmf.
It must be one of the following three:
|
B |
Number of repeats used in the simulation method. It is ignored for methods other than
the |
Details
Consider $n$ independent trials, each of which results in a success for exactly one of $m$ categories. The success probabilities of the categories may vary from trial to trial. The Poisson multinomial distribution (PMD) gives the probability of any particular combination of success counts across the $m$ categories. The success probabilities form an $n \times m$ matrix, called the success probability matrix and denoted by pmat.
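The definition above can be checked by direct enumeration on a small case. The following is a minimal base-R sketch, not part of the package: the helper name pmd_point_bruteforce is hypothetical, and it simply sums the probabilities of every assignment of trials to categories that yields the requested counts. It is only practical for very small $n$ and $m$.

```r
# Success probability matrix: 3 trials (rows), 4 categories (columns).
pp <- matrix(c(.1, .1, .1, .7,
               .1, .3, .3, .3,
               .5, .2, .1, .2), nrow = 3, byrow = TRUE)

# Hypothetical helper (not from the package): P(X = x) by brute force.
pmd_point_bruteforce <- function(pmat, x) {
  n <- nrow(pmat)
  m <- ncol(pmat)
  # Every way to assign each of the n trials to one of the m categories.
  outcomes <- as.matrix(expand.grid(rep(list(seq_len(m)), n)))
  total <- 0
  for (r in seq_len(nrow(outcomes))) {
    counts <- tabulate(outcomes[r, ], nbins = m)
    if (all(counts == x)) {
      # Probability of this assignment: product of the chosen entries.
      total <- total + prod(pmat[cbind(seq_len(n), outcomes[r, ])])
    }
  }
  total
}

p_example <- pmd_point_bruteforce(pp, c(0, 1, 0, 2))
```

Here p_example is the sum of three terms, 0.1 * 0.3 * 0.2 + 0.3 * 0.7 * 0.2 + 0.2 * 0.7 * 0.3 = 0.09, one for each choice of the trial that lands in category 2.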
Among the methods available in dpmd, "DFT-CF" is an exact method that computes all probability mass points of the distribution using a multi-dimensional FFT algorithm. As the dimension of pmat increases, the computational burden of "DFT-CF" may exceed the capacity of the computer, because the method always computes all probability mass points regardless of whether xmat is supplied.
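To illustrate the general idea of recovering a pmf from a characteristic function by FFT, here is a base-R sketch under stated assumptions. It is not the package's implementation: the function name pmd_dft_sketch is hypothetical, and the approach (evaluate the joint characteristic function of the first $m-1$ counts on an $(n+1)^{m-1}$ grid, then apply a multi-dimensional discrete Fourier transform) is assumed from the method's description rather than taken from the source code.

```r
pp <- matrix(c(.1, .1, .1, .7,
               .1, .3, .3, .3,
               .5, .2, .1, .2), nrow = 3, byrow = TRUE)

# Hypothetical sketch (not the package code): full pmf via base R's fft().
pmd_dft_sketch <- function(pmat) {
  n <- nrow(pmat)
  m <- ncol(pmat)
  N <- n + 1  # each count takes values 0..n
  # All frequency indices (k_1, ..., k_{m-1}); the first index varies
  # fastest, matching R's column-major array layout.
  k <- as.matrix(expand.grid(rep(list(0:n), m - 1)))
  z <- exp(2i * pi * k / N)
  # Characteristic function of the first m-1 counts: a product over trials
  # of p_{i,m} + sum_j p_{i,j} * exp(2*pi*i*k_j/N).
  phi <- rep(1 + 0i, nrow(k))
  for (i in seq_len(n)) {
    phi <- phi * (pmat[i, m] + as.vector(z %*% pmat[i, seq_len(m - 1)]))
  }
  # The forward DFT of the grid values, divided by the grid size, gives the pmf.
  Re(fft(array(phi, dim = rep(N, m - 1)))) / N^(m - 1)
}

res_sketch <- pmd_dft_sketch(pp)
res_sketch[1, 2, 1]  # P(X1 = 0, X2 = 1, X3 = 0, X4 = 2)
```

The grid has $(n+1)^{m-1}$ points, so memory and time grow quickly with both dimensions of pmat, which is consistent with the warning above about computational burden.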
"SIM"
is a simulation method that generates random samples from the distribution and uses relative frequencies to estimate the pmf. The accuracy and running time depend on the choice of B. Usually B = 1e5 or 1e6 is accurate enough; increasing B beyond 1e8 greatly increases the computational burden.
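The relative-frequency idea can be sketched in base R as follows. This is an illustration only, not the package's sampler: the helper name sim_pmd_point is hypothetical, and it estimates the pmf at a single point by drawing one category per trial, B times.

```r
set.seed(1)  # for reproducibility of this illustration

pp <- matrix(c(.1, .1, .1, .7,
               .1, .3, .3, .3,
               .5, .2, .1, .2), nrow = 3, byrow = TRUE)

# Hypothetical helper (not from the package): Monte Carlo estimate of P(X = x).
sim_pmd_point <- function(pmat, x, B) {
  n <- nrow(pmat)
  m <- ncol(pmat)
  hits <- 0
  for (b in seq_len(B)) {
    # Draw the category of each trial from its own row of probabilities.
    cats <- vapply(seq_len(n),
                   function(i) sample.int(m, 1, prob = pmat[i, ]),
                   integer(1))
    if (all(tabulate(cats, nbins = m) == x)) hits <- hits + 1
  }
  hits / B  # relative frequency of the requested outcome
}

est <- sim_pmd_point(pp, c(0, 1, 0, 2), B = 20000)
```

The standard error of such an estimate is roughly sqrt(p * (1 - p) / B), which is why larger values of B improve accuracy at the cost of running time.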
"NA"
is an approximation method that uses a multivariate normal distribution to approximate the pmf at the points specified in xmat. This method requires xmat to be supplied.
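One way such an approximation can be formed is sketched below in base R. The details are assumptions, not the package's code: the helper name na_pmd_sketch is hypothetical, the mean is taken as colSums(pmat), the covariance as the sum over trials of diag(p_i) - p_i p_i^T, and the density is evaluated on the first $m-1$ counts because the full covariance matrix is singular. Whether dpmd applies a continuity correction or a different parameterisation is not stated here.

```r
pp <- matrix(c(.1, .1, .1, .7,
               .1, .3, .3, .3,
               .5, .2, .1, .2), nrow = 3, byrow = TRUE)

# Hypothetical sketch (not the package code): normal density at a point x.
na_pmd_sketch <- function(pmat, x) {
  m <- ncol(pmat)
  keep <- seq_len(m - 1)       # drop the last, redundant category
  mu <- colSums(pmat)[keep]    # mean of the first m-1 counts
  Sigma <- matrix(0, m - 1, m - 1)
  for (i in seq_len(nrow(pmat))) {
    p <- pmat[i, keep]
    # Covariance contribution of one categorical trial.
    Sigma <- Sigma + diag(p, nrow = m - 1) - tcrossprod(p)
  }
  d <- x[keep] - mu
  q <- drop(crossprod(d, solve(Sigma, d)))  # squared Mahalanobis distance
  exp(-0.5 * q) / sqrt((2 * pi)^(m - 1) * det(Sigma))
}

approx_val <- na_pmd_sketch(pp, c(0, 0, 1, 2))
```

With only three trials the normal approximation should not be expected to be accurate; the sketch is meant to show the structure of the calculation, which evaluates one point at a time and therefore needs the points to be given.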
If xmat is not specified, it defaults to NULL. In this case, dpmd computes the entire pmf if the chosen method is "DFT-CF" or "SIM". If xmat is provided, only the pmf at the points specified by xmat is returned.
Value
For a given xmat, dpmd returns the pmf at the points specified by xmat.
If xmat is NULL, all probability mass points of the distribution specified by the success probability matrix pmat are computed and returned in a multi-dimensional array, denoted by res. Since pmat has dimension $n \times m$, res is an $(m-1)$-dimensional array with $n+1$ entries along each dimension, that is, $(n+1)^{m-1}$ entries in total. The value of the pmf $P(X_{1}=x_{1}, \ldots, X_{m}=x_{m})$ can then be extracted as $\mathrm{res}[x_{1}+1, \ldots, x_{m-1}+1]$.
For example, for the pmat matrix in the Examples section, the array element res[1,2,1] = 0.09 gives the value of the pmf $P(X_{1}=0, X_{2}=1, X_{3}=0, X_{4}=2) = 0.09$.
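The indexing rule can be wrapped in a small helper. The sketch below is base R only; the name pmd_lookup is hypothetical, and the array toy is a stand-in with arbitrary values (not probabilities) used purely to show which element is selected.

```r
# Hypothetical helper: look up P(X = x) in an array laid out as described above.
pmd_lookup <- function(res, x) {
  # Drop the last count (it is determined by the others) and shift the
  # remaining counts from 0-based values to 1-based array indices.
  idx <- matrix(x[-length(x)] + 1, nrow = 1)
  res[idx]
}

# Stand-in array with the shape expected for n = 3 trials and m = 4 categories.
toy <- array(seq_len(4^3), dim = c(4, 4, 4))

# Selects the same element as toy[1, 2, 1].
pmd_lookup(toy, c(0, 1, 0, 2))
```

Passing a one-row index matrix lets the same helper work for any number of categories without writing out the subscripts by hand.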
References
Lin, Z., Wang, Y., and Hong, Y. (2023). The computing of the Poisson multinomial distribution and applications in ecological inference and machine learning, Computational Statistics, Vol. 38, pp. 1851-1877.
Examples
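# Success probability matrix: 3 trials (rows), 4 categories (columns)
pp <- matrix(c(.1, .1, .1, .7, .1, .3, .3, .3, .5, .2, .1, .2), nrow = 3, byrow = TRUE)
# A single point, and a matrix holding two points (one per row)
x <- c(0, 0, 1, 2)
x1 <- matrix(c(0, 0, 1, 2, 2, 1, 0, 0), nrow = 2, byrow = TRUE)
# Default method "DFT-CF": entire pmf, then selected points only
dpmd(pmat = pp)
dpmd(pmat = pp, xmat = x1)
dpmd(pmat = pp, xmat = x)
# Normal approximation "NA": xmat must be supplied
dpmd(pmat = pp, xmat = x, method = "NA")
dpmd(pmat = pp, xmat = x1, method = "NA")
# Simulation "SIM" with B = 1e3 repeats: entire pmf, then selected points only
dpmd(pmat = pp, method = "SIM", B = 1e3)
dpmd(pmat = pp, xmat = x, method = "SIM", B = 1e3)
dpmd(pmat = pp, xmat = x1, method = "SIM", B = 1e3)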