R: Bayesian Latent Class Analysis via Gibbs Sampling

blca.gibbs {BayesLCA}

R Documentation

Bayesian Latent Class Analysis via Gibbs Sampling

Description

Latent class analysis (LCA) attempts to find G hidden classes in binary data X. blca.gibbs uses Gibbs sampling to draw samples from the posterior distribution of the model parameters.
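As a rough illustration of what Gibbs sampling for LCA involves, the following base R sketch alternates between sampling class labels, class probabilities and item probabilities from their full conditionals. It is a minimal sketch under stated assumptions (Beta priors on item probabilities, a Dirichlet prior on class probabilities, simulated data, no relabelling step) and is not the code used by blca.gibbs. All object names are illustrative.

```r
# Minimal LCA Gibbs sampler sketch (illustrative only; not the BayesLCA source).
set.seed(42)

# Simulate binary data from two classes (values chosen for illustration).
N <- 300; M <- 4; G <- 2
theta_true <- rbind(c(0.8, 0.8, 0.2, 0.2), c(0.2, 0.2, 0.8, 0.8))
z_true <- sample(1:G, N, replace = TRUE, prob = c(0.6, 0.4))
X <- matrix(rbinom(N * M, 1, theta_true[z_true, ]), N, M)

alpha <- 1; beta <- 1; delta <- 1     # uniform priors
iter <- 500; burn.in <- 100

tau   <- rep(1 / G, G)                # class probabilities
theta <- matrix(runif(G * M), G, M)   # item probabilities, G x M
tau_draws <- matrix(NA_real_, iter, G)

for (s in seq_len(burn.in + iter)) {
  # 1. Sample class labels given tau and theta.
  loglik <- X %*% t(log(theta)) + (1 - X) %*% t(log(1 - theta))  # N x G
  logp <- sweep(loglik, 2, log(tau), "+")
  p <- exp(logp - apply(logp, 1, max))   # subtract row maxima for stability
  p <- p / rowSums(p)
  z <- apply(p, 1, function(pr) sample.int(G, 1, prob = pr))

  # 2. Sample class probabilities from Dirichlet(delta + n_g),
  #    using normalised gamma draws.
  n_g <- tabulate(z, nbins = G)
  gam <- rgamma(G, shape = delta + n_g)
  tau <- gam / sum(gam)

  # 3. Sample item probabilities from their Beta full conditionals.
  for (g in seq_len(G)) {
    ones <- colSums(X[z == g, , drop = FALSE])
    theta[g, ] <- rbeta(M, alpha + ones, beta + n_g[g] - ones)
  }

  # Store draws after burn-in.
  if (s > burn.in) tau_draws[s - burn.in, ] <- tau
}

tau_hat <- colMeans(tau_draws)   # posterior mean of the class probabilities
```

Because this sketch has no relabelling step, the order of the classes in tau_hat is arbitrary.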

Usage

blca.gibbs(X, G, alpha = 1, beta = 1, delta = 1, 
	   start.vals = c("prior", "single", "across"), 
	   counts.n = NULL, iter = 5000, thin = 1, 
	   accept=thin, burn.in = 100, relabel = TRUE, 
           verbose = TRUE, verbose.update = 1000)

Arguments

`X`	The data matrix. This may take one of several forms, see `data.blca`.
`G`	The number of latent classes to fit.
`alpha`, `beta`	The prior values for the data conditional on group membership. These may take several forms: a single value, recycled across all groups and columns; a vector of length G or M (the number of columns in the data); or a G x M matrix specifying each prior value separately. Defaults to 1, i.e., a uniform prior, for each value.
`delta`	Prior values for the mixture components in the model. Defaults to 1, i.e., a uniform prior. May be a single value or a vector of length G.
`start.vals`	Denotes how class membership is to be assigned during the initial step of the algorithm. One of three character values may be chosen: "prior", which samples parameter values from prior distributions, "single", which randomly assigns data points exclusively to one class, or "across", which assigns class membership via `runif`. Alternatively, class membership may be pre-specified, either as a vector of class membership, or as a matrix of probabilities. Defaults to "single".
`counts.n`	If data patterns have already been counted, a data matrix consisting of each unique data pattern can be supplied to the function, in addition to a vector counts.n, which supplies the corresponding number of times each pattern occurs in the data.
`iter`	The number of iterations to run the Gibbs sampler for after burn-in.
`thin`	The thinning rate for samples from the distribution, in order to achieve good mixing. Should take a value greater than 0 and no greater than 1. Defaults to 1.
`accept`	Similarly to `thin`, specifies the thinning rate for samples from the distribution, in order to achieve good mixing; however, its use is discouraged. Should always agree with `thin`. Retained for backwards compatibility. See ‘Note’.
`burn.in`	Number of iterations to run the Gibbs sampler for before beginning to store values.
`relabel`	Logical, indicating whether a mechanism to prevent label-switching be used or not. Defaults to TRUE.
`verbose`	Logical valued. If TRUE, the current number of completed samples is printed at regular intervals.
`verbose.update`	If `verbose=TRUE`, `verbose.update` determines the periodicity with which updates are printed.
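To illustrate the `counts.n` argument, the base R sketch below collapses a binary data matrix into its unique response patterns and the number of times each occurs. The object names (`patterns`, `key`, `tab`) are illustrative, and the commented call at the end only shows how the two pieces would presumably be passed on, following the argument description above.

```r
# Collapse a binary matrix into unique patterns plus counts (base R only).
set.seed(1)
X <- matrix(rbinom(200 * 3, 1, 0.5), ncol = 3)   # illustrative data

# Build a string key per row, e.g. "101", and tabulate the keys.
key <- apply(X, 1, paste, collapse = "")
tab <- table(key)

# Recover the unique patterns as a matrix, in the same order as the counts.
patterns <- do.call(rbind, lapply(strsplit(names(tab), ""), as.integer))
counts.n <- as.vector(tab)

# Assumed usage, based on the description of counts.n:
# fit <- blca.gibbs(patterns, 2, counts.n = counts.n)
```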

Details

The coda package provides extensive tools to diagnose and visualise MCMC chains. The function as.mcmc.blca.gibbs makes blca.gibbs objects compatible with coda functions such as summary.mcmc and raftery.diag.
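A short sketch of this workflow follows. It assumes that both BayesLCA and coda are installed, and the simulated data and run lengths are chosen purely for illustration. effectiveSize is a coda function that is not mentioned above; it is included as one further diagnostic that may be useful.

```r
# Convergence diagnostics with coda (runs only if both packages are available).
if (requireNamespace("BayesLCA", quietly = TRUE) &&
    requireNamespace("coda", quietly = TRUE)) {
  library(BayesLCA)
  library(coda)

  set.seed(1)
  probs <- rbind(c(0.8, 0.8, 0.2, 0.2), c(0.2, 0.2, 0.8, 0.8))
  x <- rlca(500, probs, c(0.6, 0.4))

  fit <- blca.gibbs(x, 2, iter = 1000, burn.in = 100, verbose = FALSE)

  chain <- as.mcmc(fit)          # dispatches to as.mcmc.blca.gibbs
  print(summary(chain))          # summary.mcmc: means, SDs, quantiles
  print(effectiveSize(chain))    # effective sample size per parameter
  print(raftery.diag(chain))     # Raftery-Lewis run-length diagnostic
}
```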

Value

A list of class "blca.gibbs" is returned, containing:

`call`	The initial call passed to the function.
`classprob`	The class probabilities.
`itemprob`	The item probabilities, conditional on class membership.
`classprob.sd`	Posterior standard deviation estimates of the class probabilities.
`itemprob.sd`	Posterior standard deviation estimates of the item probabilities.
`logpost`	The log-posterior of the estimated model.
`Z`	Estimate of class membership for each unique datapoint.
`samples`	A list containing Gibbs samples of the item and class probabilities and log-posterior.
`DIC`	The Deviance Information Criterion for the estimated model.
`BICM`	The Bayesian Information Criterion (Monte Carlo) for the estimated model.
`AICM`	Akaike's Information Criterion (Monte Carlo) for the estimated model.
`counts`	The number of times each unique data point occurred.
`prior`	A list containing the prior values specified for the model.
`thin`	The thinning rate for samples from the distribution.
`burn.in`	The number of iterations the Gibbs sampler was run for before beginning to store values.
`relabel`	Logical, indicating whether a mechanism to prevent label-switching was used.
`labelstore`	The stored labels during the sampling run. If relabel=TRUE, these show how labels were permuted in an attempt to avoid label-switching in the model.
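The components listed above can be read directly from the returned list. The sketch below assumes BayesLCA is installed and uses simulated data chosen for illustration; it only accesses component names documented in this section.

```r
# Inspect components of a fitted blca.gibbs object (runs only if BayesLCA is available).
if (requireNamespace("BayesLCA", quietly = TRUE)) {
  library(BayesLCA)

  set.seed(1)
  probs <- rbind(c(0.8, 0.8, 0.2, 0.2), c(0.2, 0.2, 0.8, 0.8))
  x <- rlca(500, probs, c(0.6, 0.4))

  fit <- blca.gibbs(x, 2, iter = 1000, burn.in = 100, verbose = FALSE)

  print(fit$classprob)      # class probability estimates
  print(fit$classprob.sd)   # their posterior standard deviations
  print(fit$itemprob)       # item probabilities, conditional on class
  print(c(DIC = fit$DIC, BICM = fit$BICM, AICM = fit$AICM))  # model criteria
}
```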

Note

Earlier versions of this function erroneously referred to posterior standard deviations as standard errors. This also extended to arguments supplied to and returned by the function, some of which are now returned with the corrected suffix .sd (for standard deviation). For backwards compatibility, the earlier suffix .se has been retained in the returned values. The argument thin replaces accept, which appeared in the earliest version of the package. This is to maintain consistency with other packages, such as rjags.

Author(s)

Arthur White

References

Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A (2002). “Bayesian Measures of Model Complexity and Fit.” Journal of the Royal Statistical Society. Series B (Statistical Methodology), 64(4), pp. 583-639. ISSN 13697412. URL http://www.jstor.org/stable/3088806.

Raftery AE, Newton MA, Satagopan JM, Krivitsky PN (2007). “Estimating the integrated likelihood via posterior simulation using the harmonic mean identity.” In Bayesian Statistics, pp. 1-45.

Examples

## Generate a 4-dim. sample of 1000 data points from 2 latent classes,
## with mixing proportions 0.6 and 0.4.
## The item probabilities for the 2 classes are given by type1 and type2.

type1 <- c(0.8, 0.8, 0.2, 0.2)
type2 <- c(0.2, 0.2, 0.8, 0.8)
x<- rlca(1000, rbind(type1,type2), c(0.6,0.4))

## Not run: fit.gibbs<-blca.gibbs(x,2, iter=1000, burn.in=10)
## Not run: summary(fit.gibbs)
## Not run: plot(fit.gibbs)
## Not run: raftery.diag(as.mcmc(fit.gibbs))


## Not run: fit.gibbs<-blca.gibbs(x,2, iter=10000, burn.in=100, thin=0.5) 
## Not run: plot(fit.gibbs, which=4)
## Not run: raftery.diag(as.mcmc(fit.gibbs))

[Package BayesLCA version 1.9 Index]