dfba_beta_contrast {DFBA} R Documentation

Bayesian Contrasts


This function implements a Bayesian analysis of a linear contrast of conditions when there are 2 or more independent conditions and where the variate for each condition is a binomial.


dfba_beta_contrast(
  n1_vec,
  n2_vec,
  contrast_vec,
  a0_vec = rep(1, length(n1_vec)),
  b0_vec = rep(1, length(n1_vec)),
  prob_interval = 0.95,
  samples = 10000
)



n1_vec: A vector of length K that consists of the observed number of successes for the categorical variable in each of the K separate conditions


n2_vec: A vector of length K that consists of the observed number of failures for the categorical variable in each of the K separate conditions


contrast_vec: A vector of coefficients of a linear comparison among the conditions, where the sum of all the coefficients must be 0, the sum of the positive coefficients must be 1, and the sum of the negative coefficients must be -1


a0_vec: A vector of length K that consists of the prior a0 shape parameters for the separate betas (the default values are 1)


b0_vec: A vector of length K that consists of the prior b0 shape parameters for the separate betas (the default values are 1)


prob_interval: Desired probability for the equal-tail interval estimate on the contrast (default is 0.95)


samples: The desired number of Monte Carlo samples taken from each posterior beta variate (default is 10000)


Because the Bayesian analysis for each separate condition yields a posterior beta distribution with known shape parameters, the function uses Monte Carlo sampling to approximate the distribution of a linear contrast among the set of independent beta distributions. The approximation is needed because the distribution of a contrast of beta variates is not a known probability model.

Given a binomial categorical variate for each of $K$ independent conditions with $K \ge 2$, the standard frequentist nonparametric analysis is a $\chi^2$ test with $K - 1$ degrees of freedom (Siegel & Castellan, 1988). Hypothesis testing for the frequentist $\chi^2$ test assesses the sharp-null hypothesis that the binomial success rate is exactly equal in all the conditions. But this point-null hypothesis is not an interesting question about the population success rate from a Bayesian viewpoint because any single point hypothesis has a probability measure of zero (Chechile, 2020). Although the frequentist null hypothesis can be retained for small-$n$ studies, the hypothesis itself is about the population in the case of unlimited sample size, and for this limiting case it is almost certain that the hypothesis is not exactly true. Thus, from the Bayesian framework, the point-null hypothesis is not a good use of scientific effort and resources, and it is more scientifically meaningful to assess a linear comparison of the conditions, such as whether the population success rate in one condition is greater than the success rate in another condition. An interval hypothesis such as this has a meaningful probability value, as does the complementary hypothesis.

If $\phi_1$ and $\phi_2$ are, respectively, the population success rates for the binomials in conditions 1 and 2, then a meaningful comparison might be to assess the probability distribution for $\Delta = \phi_2 - \phi_1$. This example is a simple linear contrast with contrast coefficient weights of -1 and 1, which are the multipliers for the two population success rates. If the posterior interval estimate for the contrast contains 0, then the hypothesis of $\Delta = 0$ has some credibility in light of the given sample size. Thus, by estimating the distribution of $\Delta$, the user learns important information about condition differences.

As another example of a contrast, suppose there are three conditions where the first condition is a standard control and the other two conditions are different alternative conditions. In this case, a user might want to compare the mean of the control data against the average of the two experimental-condition means, i.e., the contrast of

$$\Delta = -1\phi_1 + .5\phi_2 + .5\phi_3.$$

In this second example, the coefficients of the contrast are $[-1, +.5, +.5]$. As a third example, the user might also be interested in a comparison where the two experimental conditions are compared, i.e., the contrast of

$$\Delta = 0\phi_1 + 1\phi_2 - 1\phi_3.$$

For the dfba_beta_contrast() function, the user is required to stipulate the coefficients of a contrast such that the sum of all the coefficients is 0, the sum of the positive coefficients is 1, and the sum of the negative coefficients is -1. This constraint on the coefficients forces $\Delta$ to be on the $[-1, +1]$ interval.
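The three coefficient constraints can be checked before a contrast is used. The following is a minimal sketch in base R; the helper name check_contrast and its tol argument are hypothetical and are not part of the DFBA package. Because each success rate lies in [0, 1], a contrast that passes this check cannot exceed 1 (positive-weight rates all equal to 1, the others 0) or fall below -1.

```r
# Hypothetical helper (not part of DFBA): TRUE when the coefficients
# sum to 0, the positive ones sum to 1, and the negative ones sum to -1.
check_contrast <- function(contrast_vec, tol = 1e-8) {
  pos_sum <- sum(contrast_vec[contrast_vec > 0])
  neg_sum <- sum(contrast_vec[contrast_vec < 0])
  abs(sum(contrast_vec)) < tol &&
    abs(pos_sum - 1) < tol &&
    abs(neg_sum + 1) < tol
}

ok_control_vs_avg <- check_contrast(c(-1, .5, .5))  # second example above
ok_exp_vs_exp     <- check_contrast(c(0, 1, -1))    # third example above
bad_scaling       <- check_contrast(c(1, 1, -2))    # sums to 0 but is not scaled
```

The first two results are TRUE and the last is FALSE, since the positive coefficients of c(1, 1, -2) sum to 2 rather than 1.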

There is a standard Bayesian posterior for each condition, which is a beta distribution (see Chechile (2020) for a detailed discussion of this literature). In short, it is well known that the beta distribution is a natural Bayesian conjugate function for Bernoulli random processes. Thus, a prior beta distribution with shape parameters $a_0$ and $b_0$ results (via Bayes's theorem) in a posterior beta with shape parameters $a = a_0 + n_1$ and $b = b_0 + n_2$, where $n_1$ and $n_2$ are the respective numbers of successes and failures of the categorical variable. While the Bayesian analysis of each beta distribution for the separate conditions is known, a comparison among 2 or more separate beta distributions is not distributed as a beta. A linear contrast of separate beta variates has a known posterior mean regardless of the correlations among the variates, but the distributional form of the contrast of independent betas is not known in closed form. The distributional form is needed for determining the probability that the contrast is positive or for specifying a probability interval for the contrast. With the dfba_beta_contrast() function, these important aspects of the Bayesian analysis are approximated via Monte Carlo simulation.
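As an illustration of the conjugate update and of the exact posterior mean, the sketch below applies the formulas from the preceding paragraph in base R. It uses the success and failure counts from the example on this page with the default uniform priors; the variable names post_means and contrast_mean are chosen here for illustration only, and the code is not taken from the package source.

```r
# Counts from the example on this page
n1_vec <- c(22, 15, 13, 21)   # successes per condition
n2_vec <- c(18, 25, 27, 19)   # failures per condition

# Default uniform priors: a0 = b0 = 1 in every condition
a0_vec <- rep(1, length(n1_vec))
b0_vec <- rep(1, length(n1_vec))

# Conjugate update: a = a0 + n1 and b = b0 + n2
a_vec <- a0_vec + n1_vec
b_vec <- b0_vec + n2_vec

# The mean of a beta(a, b) variate is a / (a + b); the mean of a linear
# contrast is the same linear combination of the separate means.
contrast_vec  <- c(.5, .5, -.5, -.5)
post_means    <- a_vec / (a_vec + b_vec)
contrast_mean <- sum(contrast_vec * post_means)
```

Here every condition has a + b = 42, so the contrast mean reduces to .5 * (23 + 16 - 14 - 22) / 42 = 1.5 / 42, or about 0.036. No random sampling is involved in this quantity.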

The samples argument stipulates the number of random values to be drawn from each of the $K$ posterior beta distributions. The default value for samples is 10000, which is also the minimum value that can be selected (larger values of samples provide increased precision). Posterior interval estimation and the Bayes factor for the contrast are provided on the basis of the Monte Carlo sampling. If samples is equal to $N$ and if $\phi_1, \ldots, \phi_K$ are the parameters for the population success rates, then $N$ random values are drawn from the posterior distribution of each $\phi_i$ for $i = 1, \ldots, K$. Given the contrast coefficients stipulated in the arguments, there are $N$ random posterior values of the contrast, $\Delta_j = \Psi_1\phi_{1j} + \ldots + \Psi_K\phi_{Kj}$ for $j = 1, \ldots, N$, where $\Psi_i$ are the contrast coefficients specified in the contrast_vec argument.

The Monte Carlo sampling from each posterior beta with known shape parameters uses the rbeta() function. Thus, unlike Bayesian procedures that employ Markov chain Monte Carlo algorithms, the Monte Carlo sampling in the dfba_beta_contrast() function does not depend on a burn-in process or a starting estimate, and all $N$ sampled values are valid random samples. Repeated use of the dfba_beta_contrast() function for the same input will naturally exhibit some random variation in the interval estimate and in the Bayes factor for a contrast greater than 0. However, the point estimate for the contrast does not depend on the Monte Carlo sampling, and it is constant given the vectors for n1_vec and n2_vec and given the same prior.
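The sampling scheme described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's internal code: the posterior shape parameters are those implied by the example counts on this page with uniform priors, and the names phi, delta, eti, and post_prob_positive are chosen here for illustration.

```r
set.seed(1)  # fixed only so that this illustration is reproducible

samples       <- 10000
prob_interval <- 0.95
a_vec         <- c(23, 16, 14, 22)      # posterior a shape parameters
b_vec         <- c(19, 26, 28, 20)      # posterior b shape parameters
contrast_vec  <- c(.5, -.5, -.5, .5)    # AB interaction contrast

# One column of N independent draws per condition (an N x K matrix)
phi <- sapply(seq_along(a_vec),
              function(i) rbeta(samples, a_vec[i], b_vec[i]))

# Delta_j = Psi_1 * phi_1j + ... + Psi_K * phi_Kj for j = 1, ..., N
delta <- as.vector(phi %*% contrast_vec)

# Equal-tail interval and the proportion of positive contrast values
tail_prob <- (1 - prob_interval) / 2
eti <- quantile(delta, probs = c(tail_prob, 1 - tail_prob))
post_prob_positive <- mean(delta > 0)
```

Because every draw comes directly from a beta distribution with known shape parameters, no burn-in or convergence check is needed. Without the fixed seed, eti and post_prob_positive would vary slightly from run to run, while the exact mean sum(contrast_vec * a_vec / (a_vec + b_vec)) would not.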


A list containing the following components:


Exact posterior mean estimate for the contrast


The lower equal-tail limit for the contrast for the probability interval value specified by prob_interval


The upper equal-tail limit for the contrast for the probability interval value specified by prob_interval


Posterior probability that the contrast is positive


Prior probability that the contrast is positive


The Bayes factor for the posterior-to-prior odds for a positive contrast to a non-positive contrast


Quantile values (probs = seq(0, 1, 0.01)) for the posterior contrast from the Monte Carlo sampling


A vector of length K that consists of the posterior a shape parameters for the separate posterior beta distributions


A vector of length K that consists of the posterior b shape parameters for the separate posterior beta distributions


A vector of length K that consists of the prior a0 shape parameters for the separate prior beta distributions


A vector of length K that consists of the prior b0 shape parameters for the separate prior beta distributions


A vector for the contrast coefficients for a linear comparison of posterior beta variates


The probability for the equal-tail estimate for the contrast (default is 0.95)


The number of Monte Carlo samples from the K separate posterior beta distributions
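The Bayes factor listed above is described as a ratio of posterior odds to prior odds for a positive contrast. A minimal sketch of one way to form that ratio is shown below. It assumes that both probabilities are estimated from Monte Carlo draws, which may differ from how the package obtains the prior probability; the helper prob_positive and all variable names are hypothetical.

```r
set.seed(2)  # fixed only so that this illustration is reproducible

# Hypothetical helper: Monte Carlo estimate of P(contrast > 0) for
# independent beta variates with the given shape parameters.
prob_positive <- function(shape1, shape2, contrast_vec, samples = 10000) {
  phi <- sapply(seq_along(shape1),
                function(i) rbeta(samples, shape1[i], shape2[i]))
  mean(as.vector(phi %*% contrast_vec) > 0)
}

contrast_vec <- c(.5, .5, -.5, -.5)
a0_vec <- rep(1, 4)                     # prior shape parameters
b0_vec <- rep(1, 4)
a_vec  <- a0_vec + c(22, 15, 13, 21)    # posterior shape parameters
b_vec  <- b0_vec + c(18, 25, 27, 19)

prior_prob <- prob_positive(a0_vec, b0_vec, contrast_vec)
post_prob  <- prob_positive(a_vec, b_vec, contrast_vec)

# Posterior odds divided by prior odds for a positive contrast
bayes_factor <- (post_prob / (1 - post_prob)) /
  (prior_prob / (1 - prior_prob))
```

With uniform priors and a contrast whose positive and negative coefficients mirror each other, the prior probability is close to 0.5, so the Bayes factor is close to the posterior odds. If every sampled contrast value were positive, the posterior odds would be infinite, which a fuller implementation would need to handle.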


Chechile, R. A. (2020). Bayesian Statistics for Experimental Scientists: A General Introduction Using Distribution-Free Methods. Cambridge: MIT Press.

Siegel, S. & Castellan, N. J. (1988). Nonparametric Statistics for the Behavioral Sciences. New York: McGraw Hill.


## Suppose there are four conditions from a factorial design
# where the condition labels are A1B1, A2B1, A1B2, and A2B2
# and where the frequencies for success for the binomial variate are:
n1_vec <- c(22, 15, 13, 21)
# and the frequencies for failures per condition are:
n2_vec <- c(18, 25, 27, 19)
# Let us test the following three orthogonal contrasts
contrast.B1vsB2 <- c(.5, .5, -.5, -.5)
contrast.A1vsA2 <- c(.5, -.5, .5, -.5)
contrast.ABinter <- c(.5, -.5, -.5, .5)

dfba_beta_contrast(n1_vec = n1_vec,
                   n2_vec = n2_vec,
                   contrast_vec = contrast.B1vsB2)

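dfba_beta_contrast(n1_vec = n1_vec,
                   n2_vec = n2_vec,
                   contrast_vec = contrast.A1vsA2)

dfba_beta_contrast(n1_vec = n1_vec,
                   n2_vec = n2_vec,
                   contrast_vec = contrast.ABinter)

# Plot the cumulative distribution for AB interaction
# (ab_inter_result is an illustrative object name; plot() assumes the
# package supplies a plot method for the returned object)
ab_inter_result <- dfba_beta_contrast(n1_vec = n1_vec,
                                      n2_vec = n2_vec,
                                      contrast_vec = contrast.ABinter)
plot(ab_inter_result)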
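The example describes the three contrasts as orthogonal. A quick way to confirm this, sketched below in base R, is to stack the contrasts as rows of a matrix and inspect the pairwise dot products. The object names contrast_mat and gram are illustrative.

```r
contrast.B1vsB2  <- c(.5, .5, -.5, -.5)
contrast.A1vsA2  <- c(.5, -.5, .5, -.5)
contrast.ABinter <- c(.5, -.5, -.5, .5)

# One contrast per row
contrast_mat <- rbind(contrast.B1vsB2, contrast.A1vsA2, contrast.ABinter)

# Entry [i, j] is the dot product of contrast i with contrast j;
# orthogonal contrasts give zeros off the diagonal.
gram <- tcrossprod(contrast_mat)
gram
```

All off-diagonal entries are 0, so each pair of contrasts is orthogonal, and each diagonal entry is 1 because every contrast has four coefficients of magnitude .5.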

[Package DFBA version 0.1.0 Index]