R: Bayesian Inference on Univariate Normal Mixtures

Nmix {Nmix}

R Documentation

Bayesian Inference on Univariate Normal Mixtures

Description

Wrapper for the Nmix Fortran program, which uses a reversible jump Markov chain Monte Carlo sampler to simulate from the posterior distribution of a univariate normal mixture model.

Usage

Nmix(y,tag="",seed=0,nsweep=10000,nburnin=0,
	kinit=1,qempty=1,qprior=0,qunif=0,qfix=0,qrkpos=0,qrange=1,qkappa=0,qbeta=1,
	alpha=2,beta=0.02,delta=1,eee=0,fff=0,ggg=0.2,
	hhh=10,unhw=1.0,kappa=1.0,lambda=-1,xi=0.0,sp=1,
	out="Dkdep",nspace=nsweep%/%1000,
	nmax=length(y),ncmax=30,ncmax2=10,ncd=7,ngrid=200,k1k2=c(2,8),
	idebug=-1,qdebug=0)

Arguments

`y`	either (i) a numerical data vector, (ii) a character scalar naming a numerical data vector in the global environment or (iii) a character scalar identifying a file y.dat in the current working directory containing a dataset
`tag`	name for the dataset, in the case that `y` is a numerical vector
`seed`	positive integer to set the random number seed for a reproducible run, or 0 to have the seed initialised automatically; the seed value returned in the output can be used to replicate the run subsequently
`nsweep`	number of sweeps
`nburnin`	length of burn in
`kinit`	integer, initial number of components
`qempty`	integer, 1 or 0 according to whether the empty-component birth/death moves should be used
`qprior`	integer, 1 or 0 according to whether the prior should be simulated instead of the posterior
`qunif`	integer, 1 or 0 according to whether uniform proposals should be used for the component means instead of Gaussian ones
`qfix`	integer, 1 or 0 according to whether the number of components should be held fixed (at the value of `kinit`)
`qrkpos`	integer, 1 or 0 according to whether the number of non-empty components should be reported throughout
`qrange`	integer, 1 or 0 according to whether range-based parameter priors should be used
`qkappa`	integer, 1 or 0 according to whether `kappa` should be updated
`qbeta`	integer, 1 or 0 according to whether `beta` should be updated
`alpha`	numeric, set value of parameter alpha
`beta`	numeric, set value of parameter beta
`delta`	numeric, set value of parameter delta
`eee`	numeric, set value of parameter e
`fff`	numeric, set value of parameter f
`ggg`	numeric, set value of parameter g
`hhh`	numeric, set value of parameter h
`unhw`	numeric, set value of half-width for uniform proposals
`kappa`	numeric, set value of parameter kappa
`lambda`	numeric, set value of parameter lambda; the value -1 (the default) means a prior for `k` that is uniform on 1,2,...,`ncmax`
`xi`	numeric, set value of parameter xi
`sp`	numeric, set value of parameter s
`out`	character string to specify optional output: string containing letters 'D','C','A','p','k','d','e','a' (any others are ignored); "*" is equivalent to "DCApkeda". See Details.
`nspace`	spacing between samples recorded in time-series traces (see Details)
`nmax`	integer, set upper bound for `n`
`ncmax`	integer, set upper bound for `k`; the same as `kmax` in the references
`ncmax2`	integer, set upper bound for `k` in output components `pe` and `avn`
`ncd`	integer, set number of conditional densities computed
`ngrid`	integer, set number of grid points for density evaluation
`k1k2`	vector of 2 integers, set minimum and maximum number of components for classification calculation
`idebug`	integer, sweep number from which to start printing debugging information
`qdebug`	integer, 1 or 0 according to whether debugging information is to be printed
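The default `nspace=nsweep%/%1000` relies on R's integer division. The following sketch shows what that default evaluates to for a few run lengths. `default_nspace` is a hypothetical helper, not part of the package, and the sample count assumes that one value is recorded every `nspace` sweeps.

```r
# Hypothetical helper mirroring the documented default nspace = nsweep %/% 1000.
default_nspace <- function(nsweep) nsweep %/% 1000

nsweep <- c(500, 10000, 25500)
nspace <- default_nspace(nsweep)

# Assumed trace length: one recorded value every nspace sweeps.
# A spacing of 0 has no sensible interpretation, so report NA there.
n_recorded <- ifelse(nspace > 0, nsweep %/% pmax(nspace, 1), NA)

print(data.frame(nsweep, nspace, n_recorded))
```

For `nsweep` below 1000 the default expression evaluates to 0, so it is probably safer to pass `nspace` explicitly for short runs.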

Details

Output options, selected by the letters in `out`:

Summaries

letter	quantity	output component
D	density	`den`
C	classification	`pcl` and `scl`
A	average component occupancy	`avn`

Traces

letter	quantity	component of `traces`
p	parameters	`pars`
k	number of components	`k`
d	deviance	`deviance`
e	entropy	`entropy`
a	allocations	`alloc`
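As a reading aid for the tables above, the mapping from letters to output components can be sketched in base R. `decode_out` is a hypothetical helper written for illustration; it is not part of the Nmix package and does not reflect how the package parses `out` internally.

```r
# Hypothetical helper (not part of Nmix): map the letters of an `out` string
# to the output components listed in the Details tables.
decode_out <- function(out) {
  summaries <- c(D = "den", C = "pcl, scl", A = "avn")
  traces <- c(p = "pars", k = "k", d = "deviance", e = "entropy", a = "alloc")
  if (identical(out, "*")) out <- "DCApkeda"  # documented shorthand
  letters_used <- unique(strsplit(out, "")[[1]])
  # intersect() silently drops unrecognised letters, as the documentation describes
  list(
    summaries = unname(summaries[intersect(names(summaries), letters_used)]),
    traces    = unname(traces[intersect(names(traces), letters_used)])
  )
}

str(decode_out("Dkdep"))  # the default value of `out`
str(decode_out("Dkd"))    # the value used in the Examples
```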

Value

An object of class `nmix`: a list with numerous components, including

`post`	posterior distribution of number of components `k`
`pe`	list whose `k`'th component is a `k` by 3 matrix of estimated posterior means of the weights, means and standard deviations for a mixture with `k` components
`den`	matrix of density estimates for `k=1,2,...,6` and overall, preceded by a row of abscissae at which they are evaluated - only when `out` includes "D"
`avn`	order-`ncmax2` square matrix with `(i,j)` entry the posterior expected number of observations allocated to component `i` when there are `j` components in the mixture - only when `out` includes "A"
`traces`	list of named vectors: traces of selected statistics (`k`, `entropy` as defined in Green and Richardson, 2001, etc.), sub-sampled every `nspace` sweeps
`iflag`	integer flagging successful completion of simulation (0) or not (1)
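To illustrate the layout of an element of `pe`, the sketch below builds a matrix with the same shape (one row per component; columns for weight, mean and standard deviation) and evaluates the corresponding plug-in mixture density. The numbers and the names `pe3` and `mix_density` are hypothetical and chosen only for illustration; this plug-in curve is not necessarily the same as the rows of `den`.

```r
# Hypothetical 3-component estimate laid out like pe[[3]]:
# one row per component; columns are weight, mean, sd (illustrative values only).
pe3 <- cbind(weight = c(0.2, 0.5, 0.3),
             mean   = c(-2.0, 0.0, 3.0),
             sd     = c(0.5, 1.0, 0.8))

# Plug-in mixture density: sum over components of weight * N(x; mean, sd).
mix_density <- function(x, pe_k) {
  comp <- sapply(seq_len(nrow(pe_k)), function(j)
    pe_k[j, 1] * dnorm(x, mean = pe_k[j, 2], sd = pe_k[j, 3]))
  # comp has one column per component; reshape so a single x also works
  rowSums(matrix(comp, nrow = length(x)))
}

x <- seq(-6, 8, length.out = 200)
f <- mix_density(x, pe3)

# Crude check that the curve integrates to roughly 1 over the grid.
print(sum(f) * diff(x)[1])
```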

Author(s)

Peter J. Green

References

Richardson, S. and Green, P. J. On Bayesian analysis of mixtures with an unknown number of components (with discussion), J. R. Statist. Soc. B, 1997, 59, 731-792; see also the correction in J. R. Statist. Soc. B, 1998, 60, 661.

Green, P. J. and Richardson, S. Modelling heterogeneity with and without the Dirichlet process, Scandinavian Journal of Statistics, 2001, 28, 355-375.

The author is grateful to Peter Soerensen for providing the interface to the C i/o routines used here, borrowed from his package qgg.

Examples

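data(galx)
# 10000 sweeps with a burn-in of length 1000;
# output the density estimates plus traces of k and the deviance
z <- Nmix('galx', nsweep=10000, nburnin=1000, out="Dkd")
print(z)
summary(z)
plot(z)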
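A further sketch, using only arguments and output components documented on this page, holds the number of components fixed with `qfix=1`. The names `args` and `z3` are hypothetical. The call is guarded so that it runs only when the Nmix package is installed, and it assumes that `pe[[3]]` is populated in a fixed-`k` run.

```r
# Sketch only: argument list for a run with k held fixed at kinit = 3.
args <- list(y = "galx", nsweep = 10000, nburnin = 1000,
             kinit = 3, qfix = 1,  # hold the number of components at kinit
             out = "kd")           # record traces of k and the deviance

# Run the sampler only if the package (and its galx data) is available.
if (requireNamespace("Nmix", quietly = TRUE)) {
  library(Nmix)
  data(galx)
  z3 <- do.call(Nmix, args)
  print(z3$iflag)                    # 0 flags successful completion
  print(z3$pe[[3]])                  # weights, means and sds for k = 3
  print(summary(z3$traces$deviance)) # summary of the deviance trace
}
```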

[Package Nmix version 2.0.5 Index]