MixNRMI1 {BNPdensity} | R Documentation |
Normalized Random Measures Mixture of Type I
Description
Bayesian nonparametric estimation based on normalized measures driven mixtures for locations.
Usage
MixNRMI1(
x,
probs = c(0.025, 0.5, 0.975),
Alpha = 1,
Kappa = 0,
Gama = 0.4,
distr.k = "normal",
distr.p0 = 1,
asigma = 0.5,
bsigma = 0.5,
delta_S = 3,
delta_U = 2,
Meps = 0.01,
Nx = 150,
Nit = 1500,
Pbi = 0.1,
epsilon = NULL,
printtime = TRUE,
extras = TRUE,
adaptive = FALSE
)
Arguments
x |
Numeric vector. Data set to which the density is fitted. |
probs |
Numeric vector. Desired quantiles of the density estimates. |
Alpha |
Numeric constant. Total mass of the centering measure. See details. |
Kappa |
Numeric positive constant. See details. |
Gama |
Numeric constant. |
distr.k |
The distribution name for the kernel. Allowed names are "normal", "gamma", "beta", "double exponential", "lognormal" or their common abbreviations "norm", "exp", or an integer number identifying the mixture kernel: 1 = Normal; 2 = Gamma; 3 = Beta; 4 = Double Exponential; 5 = Lognormal. |
distr.p0 |
The distribution name for the centering measure. Allowed names are "normal", "gamma", "beta", or their common abbreviations "norm", "exp", or an integer number identifying the centering measure: 1 = Normal; 2 = Gamma; 3 = Beta. |
asigma |
Numeric positive constant. Shape parameter of the gamma prior
on the standard deviation of the mixture kernel |
bsigma |
Numeric positive constant. Rate parameter of the gamma prior
on the standard deviation of the mixture kernel |
delta_S |
Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling sigma. |
delta_U |
Numeric positive constant. Metropolis-Hastings proposal variation coefficient for sampling the latent U. |
Meps |
Numeric constant. Relative error of the jump sizes in the continuous component of the process. Smaller values imply larger number of jumps. |
Nx |
Integer constant. Number of grid points for the evaluation of the density estimate. |
Nit |
Integer constant. Number of MCMC iterations. |
Pbi |
Numeric constant. Burn-in period proportion of Nit. |
epsilon |
Numeric constant. Extension to the evaluation grid range. See details. |
printtime |
Logical. If TRUE, prints out the execution time. |
extras |
Logical. If TRUE, gives additional objects: means, weights and Js. |
adaptive |
Logical. If TRUE, uses an adaptive MCMC strategy to sample the latent U (adaptive delta_U). |
Details
This generic function fits a normalized random measure (NRMI) mixture model for density estimation (James et al. 2009). Specifically, the model assumes a normalized generalized gamma (NGG) prior for the locations (means) of the mixture kernel and a parametric prior for the common smoothing parameter sigma, leading to a semiparametric mixture model.
The details of the model are:
X_i|Y_i,\sigma \sim k(\cdot
|Y_i,\sigma)
Y_i|P \sim P,\quad
i=1,\dots,n
P \sim \textrm{NGG(\texttt{Alpha,
Kappa, Gama; P\_0})}
\sigma \sim
\textrm{Gamma(asigma, bsigma)}
where
X_i
's are the observed data, Y_i
's are latent (location)
variables, sigma
is the smoothing parameter, k
is a parametric
kernel parameterized in terms of mean and standard deviation, (Alpha,
Kappa, Gama; P_0)
are the parameters of the NGG prior with P_0
being
the centering measure whose parameters are assigned vague hyper prior
distributions, and (asigma,bsigma)
are the hyper-parameters of the
gamma prior on the smoothing parameter sigma
. In particular:
NGG(Alpha, 1, 0; P_0)
defines a Dirichlet process; NGG(1,
Kappa, 1/2; P_0)
defines a Normalized inverse Gaussian process; and
NGG(1, 0, Gama; P_0)
defines a normalized stable process.
The evaluation grid ranges from min(x) - epsilon
to max(x) +
epsilon
. By default epsilon=sd(x)/4
.
Value
The function returns a MixNRMI1 object. It is based on a list with the following components:
xx |
Numeric vector. Evaluation grid. |
qx |
Numeric array. Matrix
of dimension |
cpo |
Numeric vector of |
R |
Numeric vector of
|
S |
Numeric vector of |
U |
Numeric vector of
|
Allocs |
List of |
means |
List of |
weights |
List of
|
Js |
List of |
Nm |
Integer constant. Number of jumps of the continuous component of the unnormalized process. |
Nx |
Integer constant. Number of grid points for the evaluation of the density estimate. |
Nit |
Integer constant. Number of MCMC iterations. |
Pbi |
Numeric constant. Burn-in period proportion of |
procTime |
Numeric vector with execution time provided by
|
distr.k |
Integer corresponding to the kernel chosen for the mixture |
data |
Data used for the fit |
NRMI_params |
A named list with the parameters of the NRMI process |
Warning
The function is computing intensive. Be patient.
Author(s)
Barrios, E., Kon Kam King, G., Lijoi, A., Nieto-Barajas, L.E. and Prüenster, I.
References
1.- Barrios, E., Lijoi, A., Nieto-Barajas, L. E. and Prünster, I. (2013). Modeling with Normalized Random Measure Mixture Models. Statistical Science. Vol. 28, No. 3, 313-334.
2.- James, L.F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measure with independent increments. Scand. J. Statist 36, 76-97.
See Also
MixNRMI2
, MixNRMI1cens
,
MixNRMI2cens
, multMixNRMI1
Examples
### Example 1
## Not run:
# Data
data(acidity)
x <- acidity
# Fitting the model under default specifications
out <- MixNRMI1(x)
# Plotting density estimate + 95% credible interval
plot(out)
### Example 2
set.seed(150520)
data(enzyme)
x <- enzyme
Enzyme1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.007, Gama = 0.5,
distr.k = "gamma", distr.p0 = "gamma",
asigma = 1, bsigma = 1, Meps=0.005,
Nit = 5000, Pbi = 0.2)
attach(Enzyme1.out)
# Plotting density estimate + 95% credible interval
plot(Enzyme1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()
## End(Not run)
### Example 3
## Do not run
# set.seed(150520)
# data(galaxy)
# x <- galaxy
# Galaxy1.out <- MixNRMI1(x, Alpha = 1, Kappa = 0.015, Gama = 0.5,
# distr.k = "normal", distr.p0 = "gamma",
# asigma = 1, bsigma = 1, delta = 7, Meps=0.005,
# Nit = 5000, Pbi = 0.2)
# The output of this run is already loaded in the package
# To show results run the following
# Data
data(galaxy)
x <- galaxy
data(Galaxy1.out)
attach(Galaxy1.out)
# Plotting density estimate + 95% credible interval
plot(Galaxy1.out)
# Plotting number of clusters
par(mfrow = c(2, 1))
plot(R, type = "l", main = "Trace of R")
hist(R, breaks = min(R - 0.5):max(R + 0.5), probability = TRUE)
# Plotting sigma
par(mfrow = c(2, 1))
plot(S, type = "l", main = "Trace of sigma")
hist(S, nclass = 20, probability = TRUE, main = "Histogram of sigma")
# Plotting u
par(mfrow = c(2, 1))
plot(U, type = "l", main = "Trace of U")
hist(U, nclass = 20, probability = TRUE, main = "Histogram of U")
# Plotting cpo
par(mfrow = c(2, 1))
plot(cpo, main = "Scatter plot of CPO's")
boxplot(cpo, horizontal = TRUE, main = "Boxplot of CPO's")
print(paste("Average log(CPO)=", round(mean(log(cpo)), 4)))
print(paste("Median log(CPO)=", round(median(log(cpo)), 4)))
detach()