R: Normalized Random Measures Mixture of Type I

MixNRMI1 {BNPdensity}

R Documentation

Normalized Random Measures Mixture of Type I

Description

Bayesian nonparametric estimation based on normalized measures driven mixtures for locations.

Usage

MixNRMI1(
  x,
  probs = c(0.025, 0.5, 0.975),
  Alpha = 1,
  Kappa = 0,
  Gama = 0.4,
  distr.k = "normal",
  distr.p0 = 1,
  asigma = 0.5,
  bsigma = 0.5,
  delta_S = 3,
  delta_U = 2,
  Meps = 0.01,
  Nx = 150,
  Nit = 1500,
  Pbi = 0.1,
  epsilon = NULL,
  printtime = TRUE,
  extras = TRUE,
  adaptive = FALSE
)

Arguments

`x`	Numeric vector. Data set to which the density is fitted.
`probs`	Numeric vector. Desired quantiles of the density estimates.
`Alpha`	Numeric constant. Total mass of the centering measure. See details.
`Kappa`	Numeric positive constant. See details.
`Gama`	Numeric constant. `0\leq \texttt{Gama} \leq 1`. See details.
`distr.k`	The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal.
`distr.p0`	The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta.
`asigma`	Numeric positive constant. Shape parameter of the gamma prior on the standard deviation of the mixture kernel `distr.k`.
`bsigma`	Numeric positive constant. Rate parameter of the gamma prior on the standard deviation of the mixture kernel `distr.k`.
`delta_S`	Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma.
`delta_U`	Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U.
`Meps`	Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps.
`Nx`	Integer constant. Number of grid points for the evaluation of the density estimate.
`Nit`	Integer constant. Number of MCMC iterations.
`Pbi`	Numeric constant. Burn-in period proportion of Nit.
`epsilon`	Numeric constant. Extension to the evaluation grid range. See details.
`printtime`	Logical. If TRUE, prints out the execution time.
`extras`	Logical. If TRUE, gives additional objects: means, weights and Js.
`adaptive`	Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U).

Details

This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for the locations (means) of the mixture kernel and a parametric prior for the common smoothing parameter sigma, leading to a semiparametric mixture model.

The details of the model are:

X_i|Y_i,\sigma \sim k(\cdot |Y_i,\sigma)

Y_i|P \sim P,\quad i=1,\dots,n

P \sim \textrm{NGG(\texttt{Alpha, Kappa, Gama; P\_0})}

\sigma \sim \textrm{Gamma(asigma, bsigma)}

where X_i's are the observed data, Y_i's are latent (location) variables, sigma is the smoothing parameter, k is a parametric kernel parameterized in terms of mean and standard deviation, (Alpha, Kappa, Gama; P_0) are the parameters of the NGG prior with P_0 being the centering measure whose parameters are assigned vague hyper prior distributions, and (asigma,bsigma) are the hyper-parameters of the gamma prior on the smoothing parameter sigma. In particular: NGG(Alpha, 1, 0; P_0) defines a Dirichlet process; NGG(1, Kappa, 1/2; P_0) defines a Normalized inverse Gaussian process; and NGG(1, 0, Gama; P_0) defines a normalized stable process.

The evaluation grid ranges from min(x) - epsilon to max(x) + epsilon. By default epsilon=sd(x)/4.

Value

The function returns a MixNRMI1 object. It is based on a list with the following components:

`xx`	Numeric vector. Evaluation grid.
`qx`	Numeric array. Matrix of dimension `\texttt{Nx} \times (\texttt{length(probs)} + 1)` with the posterior mean and the desired quantiles input in `probs`.
`cpo`	Numeric vector of `length(x)` with conditional predictive ordinates.
`R`	Numeric vector of `length(Nit*(1-Pbi))` with the number of mixtures components (clusters).
`S`	Numeric vector of `length(Nit*(1-Pbi))` with the values of common standard deviation sigma.
`U`	Numeric vector of `length(Nit*(1-Pbi))` with the values of the latent variable U.
`Allocs`	List of `length(Nit*(1-Pbi))` with the clustering allocations.
`means`	List of `length(Nit*(1-Pbi))` with the cluster means (locations). Only if extras = TRUE.
`weights`	List of `length(Nit*(1-Pbi))` with the mixture weights. Only if extras = TRUE.
`Js`	List of `length(Nit*(1-Pbi))` with the unnormalized weights (jump sizes). Only if extras = TRUE.
`Nm`	Integer constant. Number of jumps of the continuous component of the unnormalized process.
`Nx`	Integer constant. Number of grid points for the evaluation of the density estimate.
`Nit`	Integer constant. Number of MCMC iterations.
`Pbi`	Numeric constant. Burn-in period proportion of `Nit`.
`procTime`	Numeric vector with execution time provided by `proc.time` function.
`distr.k`	Integer corresponding to the kernel chosen for the mixture
`data`	Data used for the fit
`NRMI_params`	A named list with the parameters of the NRMI process

Warning

The function is computing intensive. Be patient.

Author(s)

Barrios, E., Kon Kam King, G., Lijoi, A., Nieto-Barajas, L.E. and Prüenster, I.

References

1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.

2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.

Examples


### Example 1
## Not run: 
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI1(x)
# Plotting density estimate + 95% credible interval
plot(out)
### Example 2
set.seed(150520)
data(enzyme)
x <- enzyme
Enzyme1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.007, Gama = 0.5,
                         distr.k = "gamma", distr.p0 = "gamma",
                         asigma = 1, bsigma = 1, Meps=0.005,
                         Nit = 5000, Pbi = 0.2)
attach(Enzyme1.out)
# Plotting density estimate + 95% credible interval
plot(Enzyme1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

## End(Not run)

### Example 3
## Do not run
# set.seed(150520)
# data(galaxy)
# x <- galaxy
#  Galaxy1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.015, Gama = 0.5,
#                          distr.k = "normal", distr.p0 = "gamma",
#                          asigma = 1, bsigma = 1, delta = 7, Meps=0.005,
#                          Nit = 5000, Pbi = 0.2)

# The output of this run is already loaded in the package
# To show results run the following
# Data
data(galaxy)
x <- galaxy
data(Galaxy1.out)
attach(Galaxy1.out)
# Plotting density estimate + 95% credible interval
plot(Galaxy1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()

[Package BNPdensity version 2023.3.8 Index]