R: Bayesian Shape-Restricted Spectral Analysis Quantile...

bsaqdpm {bsamGP}

R Documentation

Bayesian Shape-Restricted Spectral Analysis Quantile Regression with Dirichlet Process Mixture Errors

Description

This function fits a Bayesian semiparametric quantile regression model to estimate shape-restricted functions using a spectral analysis of Gaussian process priors. The model assumes that the errors follow a Dirichlet process mixture model.

Usage


bsaqdpm(formula, xmin, xmax, p, nbasis, nint,
mcmc = list(), prior = list(), egrid, ngrid = 500,
shape = c('Free', 'Increasing', 'Decreasing', 'IncreasingConvex', 'DecreasingConcave',
'IncreasingConcave', 'DecreasingConvex', 'IncreasingS', 'DecreasingS',
'IncreasingRotatedS', 'DecreasingRotatedS', 'InvertedU', 'Ushape'),
verbose = FALSE)

Arguments

`formula`	an object of class “`formula`”
`xmin`	a vector or scalar giving user-specific minimum values of x. The default values are minimum values of x.
`xmax`	a vector or scalar giving user-specific maximum values of x. The default values are maximum values of x.
`p`	quantile of interest (default=0.5).
`nbasis`	number of cosine basis functions.
`nint`	number of grid points where the unknown function is evaluated for plotting. The default is 200.
`mcmc`	a list giving the MCMC parameters. The list includes the following integers (with default values in parentheses): `nblow0 (1000)` giving the number of initialization period for adaptive metropolis, `maxmodmet (5)` giving the maximum number of times to modify metropolis, `nblow (10000)` giving the number of MCMC in transition period, `nskip (10)` giving the thinning interval, `smcmc (1000)` giving the number of MCMC for analysis, and `ndisp (1000)` giving the number of saved draws to be displayed on screen (the function reports on the screen when every `ndisp` iterations have been carried out).
`prior`	a list giving the prior information. The list includes the following parameters (default values specify the non-informative prior): `iflagprior` choosing a smoothing prior for spectral coefficients (iflagprior=0 assigns T-Smoother prior (default), iflagprior=1 chooses Lasso-Smoother prior), `theta0_m0` and `theta0_s0` giving the hyperparameters for prior distribution of the spectral coefficients (`theta0_m0` and `theta0_s0` are used when the functions have shape-restriction), `tau2_m0`, `tau2_s0` and `w0` giving the prior mean and standard deviation of smoothing prior (When iflagprior=1, tau2_m0 is only used as the hyperparameter), `beta_m0` and `beta_v0` giving the hyperparameters of the multivariate normal distribution for parametric part including intercept, `sigma2_m0` and `sigma2_v0` giving the prior mean and variance of the inverse gamma prior for the scale parameter of response, `alpha_m0` and `alpha_s0` giving the prior mean and standard deviation of the truncated normal prior distribution for the constant of integration, `iflagpsi` determining the prior of slope for logisitic function in `S` or `U` shaped (iflagpsi=1 (default), slope `\psi` is sampled and iflagpsi=0, `\psi` is fixed), `psifixed` giving initial value (iflagpsi=1) or fixed value (iflagpsi=0) of slope, `omega_m0` and `omega_s0` giving the prior mean and standard deviation of the truncated normal prior distribution for the inflection point of `S` or `U` shaped function.
`egrid`	a vector giving grid points where the residual density estimate is evaluated. The default range is from -10 to 10.
`ngrid`	a vector giving number of grid points where the residual density estimate is evaluated. The default value is 500.
`shape`	a vector giving types of shape restriction.
`verbose`	a logical variable. If `TRUE`, the iteration number and the Metropolis acceptance rate are printed to the screen.

Details

This generic function fits a Bayesian spectral analysis quantile regression model for estimating shape-restricted functions using Gaussian process priors. For enforcing shape-restrictions, the model assumes that the derivatives of the functions are squares of Gaussian processes. The model also assumes that the errors follow a Dirichlet process mixture model.

Let y_i and w_i be the response and the vector of parametric predictors, respectively. Further, let x_{i,k} be the covariate related to the response through an unknown shape-restricted function. The model for estimating shape-restricted functions is as follows.

y_i = w_i^T\beta + \sum_{k=1}^K f_k(x_{i,k}) + \epsilon_i, ~ i=1,\ldots,n,

where f_k is an unknown shape-restricted function of the scalar x_{i,k} \in [0,1] and the error terms \{\epsilon_i\} are a random sample from a Dirichlet process mixture of an asymmetric Laplace distribution, ALD_p(0,\sigma^2), which has the following probability density function:

\epsilon_i \sim f(\epsilon) = \int ALD_p(\epsilon; 0,\sigma^2)dG(\sigma^2),

G \sim DP(M,G0), ~~ G0 = Ga\left(\sigma^{-2}; \frac{r_{0,\sigma}}{2},\frac{s_{0,\sigma}}{2}\right).

The prior of function without shape restriction is:

f(x) = Z(x),

where Z is a second-order Gaussian process with mean function equal to zero and covariance function \nu(s,t) = E[Z(s)Z(t)] for s, t \in [0, 1]. The Gaussian process is expressed with the spectral representation based on cosine basis functions:

Z(x) = \sum_{j=0}^\infty \theta_j\varphi_j(x)

\varphi_0(x) = 1 ~~ \code{and} ~~ \varphi_j(x) = \sqrt{2}\cos(\pi j x), ~ j \ge 1, ~ 0 \le x \le 1

The shape-restricted functions are modeled by assuming the qth derivatives of f are squares of Gaussian processes:

f^{(q)}(x) = \delta Z^2(x)h(x), ~~ \delta \in \{1, -1\}, ~~ q \in \{1, 2\},

where h is the squish function. For monotonic, monotonic convex, and concave functions, h(x)=1, while for S and U shaped functions, h is defined by

h(x) = \frac{1 - \exp[\psi(x - \omega)]}{1 + \exp[\psi(x - \omega)]}, ~~ \psi > 0, ~~ 0 < \omega < 1

For the spectral coefficients of functions without shape constraints, the scale-invariant prior is used (The intercept is included in \beta):

\theta_j | \tau, \gamma \sim N(0, \tau^2\exp[-j\gamma]), ~ j \ge 1

The priors for the spectral coefficients of shape restricted functions are:

\theta_0 \sim N(m_{\theta_0}, v^2_{\theta_0}), \quad \theta_j | \tau, \gamma \sim N(m_{\theta_j}, \tau^2\exp[-j\gamma]), ~ j \ge 1

To complete the model specification, the popular normal prior is assumed for \beta:

\beta | \sim N(m_{0,\beta}, V_{0,\beta})

Value

An object of class bsam representing the Bayesian spectral analysis model fit. Generic functions such as print, fitted and plot have methods to show the results of the fit.

The MCMC samples of the parameters in the model are stored in the list mcmc.draws, the posterior samples of the fitted values are stored in the list fit.draws, and the MCMC samples for the log marginal likelihood are saved in the list loglik.draws. The output list also includes the following objects:

`post.est`	posterior estimates for all parameters in the model.
`lpml`	log pseudo marginal likelihood using Mukhopadhyay and Gelfand method.
`rsquarey`	correlation between `y` and `\hat{y}`.
`imodmet`	the number of times to modify Metropolis.
`pmet`	proportion of `\theta` accepted after burn-in.
`call`	the matched call.
`mcmctime`	running time of Markov chain from `system.time()`.

References

Jo, S., Choi, T., Park, B. and Lenk, P. (2019). bsamGP: An R Package for Bayesian Spectral Analysis Models Using Gaussian Process Priors. Journal of Statistical Software, 90, 310-320.

Kozumi, H. and Kobayashi, G. (2011) Gibbs sampling methods for Bayesian quantile regression. Journal of Statistical Computation and Simulation, 81(11), 1565-1578.

Lenk, P. and Choi, T. (2017) Bayesian Analysis of Shape-Restricted Functions using Gaussian Process Priors. Statistica Sinica, 27, 43-69.

MacEachern, S. N. and Müller, P. (1998) Estimating mixture of Dirichlet process models. Journal of Computational and Graphical Statistics, 7, 223-238.

Mukhopadhyay, S. and Gelfand, A. E. (1997) Dirichlet process mixed generalized linear models. Journal of the American Statistical Association, 92, 633-639.

Neal, R. M. (2000) Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics, 9, 249-265.

Examples

## Not run: 
######################
# Increasing-concave #
######################

# Simulate data
set.seed(1)

n <- 500
x <- runif(n)
e <- c(rald(n/2, scale = 0.5, p = 0.5),
       rald(n/2, scale = 3, p = 0.5))
y <- log(1 + 10*x) + e

# Number of cosine basis functions
nbasis <- 50

# Fit the model with default priors and mcmc parameters
fout1 <- bsaqdpm(y ~ fs(x), p = 0.25, nbasis = nbasis,
                 shape = 'IncreasingConcave')
fout2 <- bsaqdpm(y ~ fs(x), p = 0.5, nbasis = nbasis,
                 shape = 'IncreasingConcave')
fout3 <- bsaqdpm(y ~ fs(x), p = 0.75, nbasis = nbasis,
                 shape = 'IncreasingConcave')

# fitted values
fit1 <- fitted(fout1)
fit2 <- fitted(fout2)
fit3 <- fitted(fout3)

# plots
plot(x, y, lwd = 2, xlab = 'x', ylab = 'y')
lines(fit1$xgrid, fit1$wbeta$mean[1] + fit1$fxgrid$mean, lwd=2, col=2)
lines(fit2$xgrid, fit2$wbeta$mean[1] + fit2$fxgrid$mean, lwd=2, col=3)
lines(fit3$xgrid, fit3$wbeta$mean[1] + fit3$fxgrid$mean, lwd=2, col=4)
legend('topleft',legend=c('1st Quartile','2nd Quartile','3rd Quartile'),
       lwd=2, col=2:4, lty=1)


## End(Not run)

[Package bsamGP version 1.2.5 Index]