R: Simulating a multivariate Bernoulli distribution

RMultBinary {mipfp}

R Documentation

Simulating a multivariate Bernoulli distribution

Description

This function generates a sample from a multinomial distribution of K dependent binary (Bernoulli) variables (X_1, X_2, ..., X_K) defined by an array (of 2^K cells) detailing the joint-probabilities.

Usage

RMultBinary(n = 1, mult.bin.dist, target.values = NULL)

Arguments

`n`	Desired sample size. Default = 1.
`mult.bin.dist`	A list describing the multivariate binary distribution. It can be generated by the `ObtainMultBinaryDist` function. The list contains at least the element `joint.proba`, an array detailing the joint-probabilities of the `K` binary variables. The array has `K` dimensions of size 2, referring to the 2 possible outcomes of the considered variable. Hence, the total number of elements is `2^K`. Additionnaly the list can also provides the element `var.label`, a list containing the names of the `K` variables.
`target.values`	A list describing the possibles outcomes of each binary variable, for instance {1, 2}. Default = {0, 1}.

Value

A list whose elements are detailed herehunder.

`binary.sequences`	The generated `K \times n` random sequence.
`possible.binary.sequences`	The possible binary sequences, i.e. the domain.
`chosen.random.index`	The index of the random draws in the domain.

Author(s)

Thomas Suesse

Maintainer: Johan Barthelemy <johan@uow.edu.au>.

References

Lee, A.J. (1993). Generating Random Binary Deviates Having Fixed Marginal Distributions and Specified Degrees of Association. The American Statistician 47 (3): 209-215.

Qaqish, B. F., Zink, R. C., and Preisser, J. S. (2012). Orthogonalized residuals for estimation of marginally specified association parameters in multivariate binary data. Scandinavian Journal of Statistics 39, 515-527.

Examples

# from Qaqish et al. (2012)
or <- matrix(c(Inf, 0.281, 2.214, 2.214,
               0.281, Inf, 2.214, 2.214,
               2.214, 2.214, Inf, 2.185,
               2.214, 2.214, 2.185, Inf), nrow = 4, ncol = 4, byrow = TRUE)
rownames(or) <- colnames(or) <- c("Parent1", "Parent2", "Sibling1", "Sibling2")

# hypothetical marginal probabilities
p <- c(0.2, 0.4, 0.6, 0.8)

# estimating the joint-distribution
p.joint <- ObtainMultBinaryDist(odds = or, marg.probs = p)

# simulating 100,000 draws from the obtained joint-distribution
y.sim <- RMultBinary(n = 1e5, mult.bin.dist = p.joint)$binary.sequences

# checking results
cat('dim y.sim =', dim(y.sim)[1], 'x', dim(y.sim)[2], '\n')
cat('Estimated marginal probs from simulated data\n')
apply(y.sim,2,mean)
cat('True probabilities\n')
print(p)
cat('Estimated correlation from simulated data\n')
cor(y.sim)
cat('True correlation\n')
Odds2Corr(or,p)$corr

# generating binary outcomes with outcome different than 0, 1
RMultBinary(n = 10, mult.bin.dist = p.joint, 
            target.values = list(c("A", "B"), c(0, 1), c(1, 2), c(100, 101)))

[Package mipfp version 3.2.1 Index]