ei.MD.bayes {eiPack} | R Documentation |
Multinomial Dirichlet model for Ecological Inference in RxC tables
Description
Implements a version of the hierarchical model suggested in Rosen et al. (2001)
Usage
ei.MD.bayes(formula, covariate = NULL, total = NULL, data,
lambda1 = 4, lambda2 = 2, covariate.prior.list = NULL,
tune.list = NULL, start.list = NULL, sample = 1000, thin = 1,
burnin = 1000, verbose = 0, ret.beta = 'r',
ret.mcmc = TRUE, usrfun = NULL)
Arguments
formula |
A formula of the form |
covariate |
An optional formula of the form |
total |
if row and/or column marginals are given as proportions,
|
data |
A data frame containing the variables specified in
|
lambda1 |
The shape parameter for the gamma prior (defaults to 4) |
lambda2 |
The rate parameter for the gamma prior (defaults to 2) |
covariate.prior.list |
a list containing the parameters for normal prior distributions on delta and gamma for model with covariate. See ‘details’ for more information. |
tune.list |
A list containing tuning parameters for each block of
parameters. See ‘details’ for more information. Typically, this
will be a list generated by |
start.list |
A list containing starting values for each block of
parameters. See ‘details’ for more information. The default is
|
sample |
Number of draws to be saved from chain
and returned as output from the function (defaults to 1000). The total
length of the chain is |
thin |
an integer specifying the thinning interval for posterior draws (defaults to 1, but most problems will require a much larger thinning interval). |
burnin |
integer specifying the number of initial iterations to be discarded (defaults to 1000, but most problems will require a longer burnin). |
verbose |
an integer specifying whether the progress of the sampler
is printed to the screen (defaults to 0). If |
ret.beta |
A character indicating how the posterior draws of beta should be
handled: ' |
ret.mcmc |
A logical value indicating how the samples from the posterior
should be returned. If |
usrfun |
the name of an optional user-defined function to obtain quantities of
interest while drawing from the MCMC chain (defaults to |
Details
ei.MD.bayes implements a version of the hierarchical Multinomial-Dirichlet model for ecological inference in R \times C tables suggested by Rosen et al. (2001).
Let r = 1, \ldots, R index rows, c = 1, \ldots, C index columns, and i = 1, \ldots, n index units. Let N_{\cdot ci} be the marginal count for column c in unit i and X_{ri} be the marginal proportion for row r in unit i. Finally, let \beta_{rci} be the proportion of row r in column c for unit i.
The first stage of the model assumes that the vector of column marginal counts in unit i follows a Multinomial distribution of the form:
(N_{\cdot 1i}, \ldots, N_{\cdot Ci}) \sim {\rm Multinomial}(N_i, \sum_{r=1}^R \beta_{r1i}X_{ri}, \dots, \sum_{r=1}^R \beta_{rCi}X_{ri})
The second stage of the model assumes that the vector of \beta for row r in unit i follows a Dirichlet distribution with C parameters. The model may be fit with or without a covariate.
If the model is fit without a covariate, the distribution of the vector \beta_{ri} is:
(\beta_{r1i}, \dots, \beta_{rCi}) \sim {\rm Dirichlet}(\alpha_{r1}, \dots, \alpha_{rC})
In this case, the prior on each \alpha_{rc} is assumed to be:
\alpha_{rc} \sim {\rm Gamma}(\lambda_1, \lambda_2)
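The two stages of the no-covariate model can be read as a recipe for simulating data. The following is a minimal sketch in base R under stated assumptions, not eiPack's own code: the table sizes, the row proportions X, the unit totals N, and the helper rdirichlet1 are all hypothetical and chosen only for illustration.

```r
# Illustrative simulation of the no-covariate model; not eiPack code.
set.seed(1)
R <- 2; C <- 3; n <- 5          # rows, columns, units (hypothetical sizes)
lambda1 <- 4; lambda2 <- 2      # shape and rate of the Gamma prior

# alpha_rc ~ Gamma(lambda1, lambda2), arranged as an R x C matrix
alpha <- matrix(rgamma(R * C, shape = lambda1, rate = lambda2), R, C)

# Hypothetical helper: one Dirichlet draw via normalized Gamma draws
rdirichlet1 <- function(a) {
  g <- rgamma(length(a), shape = a, rate = 1)
  g / sum(g)
}

# beta[r, c, i]: row r of unit i is Dirichlet(alpha[r, ])
beta <- array(NA_real_, dim = c(R, C, n))
for (i in seq_len(n)) {
  for (r in seq_len(R)) {
    beta[r, , i] <- rdirichlet1(alpha[r, ])
  }
}

# Hypothetical row marginal proportions X[r, i] (each column sums to 1)
# and unit totals N_i
X <- matrix(c(0.3, 0.7), nrow = R, ncol = n)
N <- rep(500, n)

# Column counts: Multinomial(N_i, sum_r beta_rci * X_ri), one column per unit
counts <- sapply(seq_len(n), function(i) {
  p <- as.vector(t(beta[, , i]) %*% X[, i])   # length-C probabilities
  rmultinom(1, size = N[i], prob = p)
})
```

Here counts is a C by n matrix whose columns play the role of (N_{\cdot 1i}, \ldots, N_{\cdot Ci}); the function itself works in the opposite direction, inferring \beta and \alpha from observed margins.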
If the model is fit with a covariate, the distribution of the vector \beta_{ri} is:
(\beta_{r1i}, \dots, \beta_{rCi}) \sim {\rm Dirichlet}(d_r\exp(\gamma_{r1} + \delta_{r1}Z_i), \dots, d_r\exp(\gamma_{r(C-1)} + \delta_{r(C-1)}Z_i), d_r)
The parameters \gamma_{rC} and \delta_{rC} are constrained to be zero for identification. (In this function, the last column entered in the formula is so constrained.)
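To make the identification constraint concrete, the sketch below computes the Dirichlet parameters for a single unit. All numeric values are hypothetical; only the formula d_r\exp(\gamma_{rc} + \delta_{rc}Z_i) and the zero constraint on the last column come from the description above.

```r
# Illustrative values only; not estimates from any data set.
R <- 2; C <- 3
d     <- c(1.5, 2.0)                                          # d_r > 0
gamma <- cbind(matrix(c(0.2, -0.1, 0.4, 0.3), R, C - 1), 0)   # last column fixed at 0
delta <- cbind(matrix(c(0.5, 0.0, -0.2, 0.1), R, C - 1), 0)   # last column fixed at 0
Z_i   <- 0.8                                                  # covariate value for unit i

# R x C matrix of Dirichlet parameters: d_r * exp(gamma_rc + delta_rc * Z_i).
# d recycles down the columns, so row r is scaled by d[r].
dir_par <- d * exp(gamma + delta * Z_i)

# Prior mean of beta_ri implied by these parameters (rows sum to 1)
prior_mean <- dir_par / rowSums(dir_par)
```

Because the last column of gamma and delta is zero, the last column of dir_par reduces to d_r, matching the final argument of the Dirichlet distribution above.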
Finally, the prior for d_r is:
d_r \sim {\rm Gamma}(\lambda_1, \lambda_2)
while \gamma_{rc} and \delta_{rc}, for c = 1, \ldots, C-1, are given improper uniform priors if covariate.prior.list = NULL or have independent normal priors of the form:
\delta_{rc} \sim {\rm N}(\mu_{\delta_{rc}}, \sigma_{\delta_{rc}}^2)
\gamma_{rc} \sim {\rm N}(\mu_{\gamma_{rc}}, \sigma_{\gamma_{rc}}^2)
If the user wishes to estimate the model with proper normal priors on \gamma_{rc} and \delta_{rc}, a list with four elements must be provided for covariate.prior.list:
mu.delta
an R \times (C-1) matrix of prior means for Delta
sigma.delta
an R \times (C-1) matrix of prior standard deviations for Delta
mu.gamma
an R \times (C-1) matrix of prior means for Gamma
sigma.gamma
an R \times (C-1) matrix of prior standard deviations for Gamma
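As an illustration of the expected shape, a prior list for a table with R = 2 rows and C = 3 columns might be built as follows. The element names come from the list above; the prior means of 0 and standard deviations of 10 are arbitrary placeholder values, not recommendations.

```r
R <- 2; C <- 3   # hypothetical table dimensions

# Four R x (C-1) matrices, named as described above.
# The values 0 and 10 are placeholders chosen for illustration.
covariate.prior.list <- list(
  mu.delta    = matrix(0,  nrow = R, ncol = C - 1),
  sigma.delta = matrix(10, nrow = R, ncol = C - 1),
  mu.gamma    = matrix(0,  nrow = R, ncol = C - 1),
  sigma.gamma = matrix(10, nrow = R, ncol = C - 1)
)
```

The resulting list would then be passed as the covariate.prior.list argument.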
Applying the model without a covariate is most reasonable in situations where one can think of individuals being randomly assigned to units, so that there are no aggregation or contextual effects. When this assumption is not reasonable, including an appropriate covariate may improve inferences; note, however, that there is typically little information in the data about the relationship of any given covariate to the unit parameters, which can lead to extremely slow mixing of the MCMC chains and difficulty in assessing convergence.
Because the conditional distributions are non-standard, draws from the posterior are obtained by using a Metropolis-within-Gibbs algorithm. The proposal density for each parameter is a univariate normal distribution centered at the current parameter value with standard deviation equal to the tuning constant; the only exception is for draws of \gamma_{rc} and \delta_{rc}, which use a bivariate normal proposal with covariance zero.
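The role of the tuning constant can be seen in a generic random-walk Metropolis update. The sketch below is not the sampler used by the function: metropolis_step and log_post are hypothetical names, and the target is a toy Gamma(4, 2) density rather than one of the model's conditional distributions.

```r
# One random-walk Metropolis update for a scalar parameter.
# 'tune' is the standard deviation of the normal proposal.
metropolis_step <- function(current, log_post, tune) {
  proposal  <- rnorm(1, mean = current, sd = tune)
  log_ratio <- log_post(proposal) - log_post(current)
  if (log(runif(1)) < log_ratio) proposal else current
}

# Toy target: Gamma(shape = 4, rate = 2) log density up to a constant;
# proposals at or below zero get log density -Inf and are always rejected.
log_post <- function(a) if (a > 0) 3 * log(a) - 2 * a else -Inf

set.seed(42)
draws <- numeric(5000)
a <- 1
for (s in seq_along(draws)) {
  a <- metropolis_step(a, log_post, tune = 1)
  draws[s] <- a
}

# Share of iterations in which the proposal was accepted
acc_ratio <- mean(diff(draws) != 0)
```

In a Metropolis-within-Gibbs scheme, an update of this kind is applied to each parameter in turn, conditional on the current values of the others. A larger tune proposes bigger moves and lowers the acceptance ratio; a smaller one raises it but explores the posterior more slowly.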
The function will accept user-specified starting values as an argument. If the model includes a covariate, the starting values must be a list with the following elements, in this order:
start.dr
a vector of length R of starting values for Dr. Starting values for Dr must be greater than zero.
start.betas
an R \times C by precincts array of starting values for Beta. Each row of every precinct must sum to 1.
start.gamma
an R \times C matrix of starting values for Gamma. Values in the right-most column must be zero.
start.delta
an R \times C matrix of starting values for Delta. Values in the right-most column must be zero.
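A starting-value list for the covariate model that satisfies these constraints could be assembled as in the following sketch. The dimensions and numeric values are hypothetical; the element names and their order follow the list above.

```r
R <- 2; C <- 3; n <- 4   # hypothetical sizes; n is the number of precincts

start.list <- list(
  start.dr    = rep(1, R),                        # must be > 0
  start.betas = array(1 / C, dim = c(R, C, n)),   # each row of each precinct sums to 1
  start.gamma = cbind(matrix(0.1, R, C - 1), 0),  # right-most column is zero
  start.delta = cbind(matrix(0,   R, C - 1), 0)   # right-most column is zero
)
```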
If there is no covariate, the starting values must be a list with the following elements:
start.alphas
an R \times C matrix of starting values for Alpha. Starting values for Alpha must be greater than zero.
start.betas
an R \times C \times units array of starting values for Beta. Each row in every unit must sum to 1.
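For the model without a covariate, one way to obtain valid random starting values for Beta is to normalize positive random numbers within each row of each unit. This is only a sketch with hypothetical dimensions; any array whose rows sum to 1 would meet the stated requirement.

```r
set.seed(7)
R <- 2; C <- 3; n <- 4   # hypothetical sizes; n is the number of units

# Positive random numbers, then divide each row of each unit by its sum
raw   <- array(runif(R * C * n), dim = c(R, C, n))
betas <- sweep(raw, c(1, 3), apply(raw, c(1, 3), sum), "/")

start.list <- list(
  start.alphas = matrix(1, R, C),   # must be > 0
  start.betas  = betas              # each row in every unit sums to 1
)
```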
The function will accept user-specified tuning parameters as an argument. The tuning parameters define the standard deviation of the normal distribution used to generate candidate values for each parameter. For the model with a covariate, a bivariate normal distribution is used to generate proposals; the covariance of these normal distributions is fixed at zero. If the model includes a covariate, the tuning parameters must be a list with the following elements, in this order:
tune.dr
a vector of length R of tuning parameters for Dr
tune.beta
an R \times (C-1) by precincts array of tuning parameters for Beta
tune.gamma
an R \times (C-1) matrix of tuning parameters for Gamma
tune.delta
an R \times (C-1) matrix of tuning parameters for Delta
If there is no covariate, the tuning parameters are a list with the following elements:
tune.alpha
an R \times C matrix of tuning parameters for Alpha
tune.beta
an R \times (C-1) by precincts array of tuning parameters for Beta
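The shapes of the two tuning lists described above can be sketched by hand as follows. The object names tune.cov and tune.nocov and all numeric values are hypothetical; in practice, suitable tuning constants depend on the data.

```r
R <- 2; C <- 3; n <- 4   # hypothetical sizes; n is the number of precincts

# Model with a covariate: four elements, in this order
tune.cov <- list(
  tune.dr    = rep(0.5, R),
  tune.beta  = array(0.05, dim = c(R, C - 1, n)),
  tune.gamma = matrix(0.1, R, C - 1),
  tune.delta = matrix(0.1, R, C - 1)
)

# Model without a covariate: two elements
tune.nocov <- list(
  tune.alpha = matrix(0.5, R, C),
  tune.beta  = array(0.05, dim = c(R, C - 1, n))
)
```

Every entry is a proposal standard deviation, so all values must be positive.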
Value
A list containing
draws |
A list containing samples from the posterior distribution of the parameters. If a covariate is included in the model, the list contains:
If the model is fit without a covariate, the list includes:
|
acc.ratios |
A list containing acceptance ratios for the parameters. If the model includes a covariate, the list includes:
If the model is fit without a covariate, the list includes:
|
usrfun |
Output from the optional |
call |
Call to |
Author(s)
Michael Kellermann <mrkellermann@gmail.com> and Olivia Lau <olivia.lau@post.harvard.edu>
References
Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2002. Output Analysis and Diagnostics for MCMC (CODA). https://CRAN.R-project.org/package=coda.
Ori Rosen, Wenxin Jiang, Gary King, and Martin A. Tanner. 2001. “Bayesian and Frequentist Inference for Ecological Inference: The R \times C Case.” Statistica Neerlandica 55: 134-156.
See Also
lambda.MD, cover.plot, density.plot, tuneMD, mergeMD