R: Data Augmentation algorithm for incomplete categorical data

da.cat {cat}

R Documentation

Data Augmentation algorithm for incomplete categorical data

Description

Markov chain Monte Carlo method for simulating draws from the observed-data posterior distribution of the underlying cell probabilities under a saturated multinomial model. May be used in conjunction with `imp.cat` to create proper multiple imputations.

Usage

da.cat(s, start, prior=0.5, steps=1, showits=FALSE)

Arguments

`s`	summary list of an incomplete categorical dataset created by the function `prelim.cat`.
`start`	starting value of the parameter. This is an array of cell probabilities of dimension `s$d`, such as one created by `em.cat`. If structural zeros appear in the table, starting values for those cells should be zero.
`prior`	optional array of hyperparameters specifying a Dirichlet prior distribution. The default is the Jeffreys prior (all hyperparameters = 0.5). If structural zeros appear in the table, a prior should be supplied with hyperparameters set to `NA` for those cells.
`steps`	number of data augmentation steps to be taken. Each step consists of an imputation or I-step followed by a posterior or P-step.
`showits`	if `TRUE`, reports the iterations so the user can monitor the progress of the algorithm.
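
As a rough illustration of the structural-zero convention described for `start` and `prior`, the following base R sketch builds both arrays for a hypothetical 2 x 2 table. The dimension vector `d` stands in for `s$d`, and the choice of cell `[2, 2]` as a structural zero is an assumption made purely for the example.

```r
# Hypothetical table dimensions; in practice these would come from s$d.
d <- c(2, 2)

# Jeffreys-type prior: every hyperparameter equal to 0.5 ...
prior <- array(0.5, dim = d)
# ... except the (assumed) structural-zero cell, which is marked NA.
prior[2, 2] <- NA

# Starting value: uniform over the non-structural cells, zero elsewhere.
start <- array(1, dim = d)
start[is.na(prior)] <- 0
start <- start / sum(start)
```

Arrays built this way could then be passed as the `start` and `prior` arguments, provided their dimensions match `s$d`.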

Details

At each step, the missing data are randomly imputed under their predictive distribution given the observed data and the current value of theta (I-step), and then a new value of theta is drawn from its Dirichlet posterior distribution given the complete data (P-step). After a suitable number of steps are taken, the resulting value of the parameter may be regarded as a random draw from its observed-data posterior distribution.
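
The two steps can be sketched in base R for a toy problem. This is only a minimal illustration of the idea, not the package's implementation: the 2 x 2 table, the counts, the helper name `da_step`, and the use of `set.seed` (rather than `rngseed`) are all assumptions made for the example. Some units are fully classified, while others have the row variable observed and the column variable missing.

```r
# Toy data for a 2 x 2 table (rows = variable A, columns = variable B).
n_full    <- matrix(c(30, 10, 15, 25), nrow = 2)  # A and B both observed
m_rowonly <- c(12, 8)                             # A observed, B missing

# One data augmentation step (hypothetical helper, not part of the package).
da_step <- function(theta, n_full, m_rowonly, prior = 0.5) {
  complete <- n_full
  for (i in seq_len(nrow(theta))) {
    # I-step: allocate the partially classified units in row i across the
    # columns using the conditional probabilities P(B | A = i) under theta.
    p_cond <- theta[i, ] / sum(theta[i, ])
    complete[i, ] <- complete[i, ] + rmultinom(1, m_rowonly[i], p_cond)[, 1]
  }
  # P-step: draw theta from Dirichlet(complete counts + prior) by
  # normalising independent gamma variates.
  g <- rgamma(length(complete), shape = complete + prior)
  array(g / sum(g), dim = dim(complete))
}

set.seed(1)                       # base R generator, for this sketch only
theta <- n_full / sum(n_full)     # crude starting value
for (k in 1:50) theta <- da_step(theta, n_full, m_rowonly)
theta
```

After the loop, `theta` holds one simulated table of cell probabilities that sums to one; repeating the loop from the current value yields further, approximately independent draws if enough steps separate them.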

When the pattern of observed data is close to monotone, `mda.cat` is preferred because it tends to converge more quickly.

Value

An array like `start` containing simulated cell probabilities.

Note

IMPORTANT: The random number generator seed must be set at least once by the function `rngseed` before this function can be used.

References

Schafer (1996) Analysis of Incomplete Multivariate Data, Chapman & Hall, Chapter 7.

Examples

data(crimes)
x      <- crimes[,-3]
counts <- crimes[,3]
s <- prelim.cat(x,counts)        # preliminary manipulations
thetahat <- em.cat(s)            # find ML estimate under saturated model
rngseed(7817)                    # set random number generator seed
theta <- da.cat(s,thetahat,steps=50)   # take 50 steps from MLE (name steps; the third positional argument is prior)
ximp  <- imp.cat(s,theta)        # impute once under theta
theta <- da.cat(s,theta,steps=50)      # take another 50 steps
ximp  <- imp.cat(s,theta)        # impute again under new theta
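
The same pattern extends to a loop that collects several imputations. The sketch below assumes the `cat` package is installed; the number of imputations `m = 5`, the spacing of 50 steps between them, and the list name `imputations` are illustrative choices, not recommendations from the package.

```r
library(cat)

data(crimes)
s <- prelim.cat(crimes[, -3], crimes[, 3])  # preliminary manipulations
thetahat <- em.cat(s)                       # ML estimate as starting value
rngseed(7817)                               # must be set before da.cat

m <- 5                                      # assumed number of imputations
imputations <- vector("list", m)
theta <- thetahat
for (i in seq_len(m)) {
  # Run 50 further steps so successive draws are roughly independent,
  # then impute once under the current parameter value.
  theta <- da.cat(s, theta, steps = 50)
  imputations[[i]] <- imp.cat(s, theta)
}
```

Whether 50 steps is enough separation depends on the data; convergence would normally be checked before settling on a spacing.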

[Package cat version 0.0-9 Index]