R: Impute missing categorical data

imp.cat {cat}

R Documentation

Impute missing categorical data

Description

Performs single random imputation of missing values in a categorical dataset under a user-supplied value of the underlying cell probabilities.

Usage

imp.cat(s, theta)

Arguments

`s`	summary list of an incomplete categorical dataset created by the function `prelim.cat`.
`theta`	parameter value under which the missing data are to be imputed. This is an array of cell probabilities of dimension `s$d` whose elements sum to one, such as produced by `em.cat`, `ecm.cat`, `da.cat`, `mda.cat` or `dabipf`.
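
As a rough illustration of the shape `theta` is expected to have, the base-R sketch below builds a probability array for a hypothetical table with dimensions `d <- c(2, 3)` and checks the properties stated above. The names `d`, `raw`, and `is_valid_theta` are invented for this example and are not part of the package.

```r
# Hypothetical dimensions standing in for s$d (a 2 x 3 table).
d <- c(2, 3)

# Arbitrary positive weights, normalized so the cells sum to one.
raw   <- array(c(4, 1, 3, 2, 6, 4), dim = d)
theta <- raw / sum(raw)

# Illustrative helper (not a cat function): checks that theta has the
# given dimensions, is non-negative, and sums to one within a tolerance.
is_valid_theta <- function(theta, d, tol = 1e-8) {
  identical(as.integer(dim(theta)), as.integer(d)) &&
    all(theta >= 0) &&
    abs(sum(theta) - 1) < tol
}

is_valid_theta(theta, d)
```

A check of this kind can catch an array that has the wrong dimensions or has not been normalized before it is passed on.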

Details

Missing data are drawn independently for each observational unit from their conditional predictive distribution given the observed data and `theta`.
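
To make the conditional predictive distribution concrete, here is a minimal base-R sketch for a single unit in a hypothetical 2 x 3 table whose first variable is observed and whose second is missing. It illustrates the idea only and is not the package's internal algorithm. The names `theta`, `cond1`, `v2`, and `draw_missing` are assumptions made for the example, and it uses base R's `set.seed` rather than `rngseed`.

```r
# Hypothetical cell probabilities for variables V1 (2 levels) and V2 (3 levels).
theta <- array(c(0.20, 0.05, 0.15, 0.10, 0.30, 0.20), dim = c(2, 3))

# Illustrative helper: given the observed level of V1, draw V2 from
# P(V2 | V1 = obs1), i.e. the matching row of theta renormalized to sum to one.
draw_missing <- function(theta, obs1) {
  cond <- theta[obs1, ] / sum(theta[obs1, ])
  sample(seq_along(cond), size = 1, prob = cond)
}

set.seed(1)                               # base-R seed, only for this sketch
cond1 <- theta[1, ] / sum(theta[1, ])     # conditional distribution when V1 = 1
v2    <- draw_missing(theta, obs1 = 1)    # one imputed level for V2
```

Repeating such a draw separately for every incomplete unit corresponds to the independence across units described above.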

Value

If the original incomplete dataset was in ungrouped format (`s$grouped=F`), then a matrix like `s$x` except that all `NA`s have been filled in.

If the original dataset was grouped, then a list with the following components:

`x`	matrix of levels for the categorical variables.
`counts`	vector of length `nrow(x)` containing frequencies or counts corresponding to the levels in `x`.
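
When the grouped form is returned, it can be convenient to expand it to one row per observational unit. The sketch below does this in base R on a hand-made list that merely mimics the `x` and `counts` components described above. The names `imp` and `expand_grouped` are hypothetical, and the values are made up for illustration rather than taken from `imp.cat` output.

```r
# Hand-made stand-in for a grouped result: levels in x, frequencies in counts.
imp <- list(
  x      = matrix(c(1, 1, 2, 2,
                    1, 2, 1, 2), ncol = 2),
  counts = c(3, 0, 2, 1)
)

# Illustrative helper: repeat each row of x according to its count.
expand_grouped <- function(imp) {
  imp$x[rep(seq_len(nrow(imp$x)), times = imp$counts), , drop = FALSE]
}

ungrouped <- expand_grouped(imp)
nrow(ungrouped) == sum(imp$counts)
```

Rows with a count of zero simply drop out of the expanded matrix.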

Note

IMPORTANT: The random number generator seed must be set by the function `rngseed` at least once in the current session before this function can be used.

Examples

library(cat)                     # load the package that provides these functions
data(crimes)
x      <- crimes[,-3]
counts <- crimes[,3]
s <- prelim.cat(x,counts)        # preliminary manipulations
thetahat <- em.cat(s)            # find ML estimate under saturated model
rngseed(7817)                    # set random number generator seed
theta <- da.cat(s,thetahat,steps=50)   # take 50 steps from MLE
ximp  <- imp.cat(s,theta)        # impute once under theta
theta <- da.cat(s,theta,steps=50)      # take another 50 steps
ximp  <- imp.cat(s,theta)        # impute again under new theta

[Package cat version 0.0-9 Index]