R: Monte Carlo Maximum Likelihood estimation for the Poisson...

poisson.log.MCML {PrevMap}

R Documentation

Monte Carlo Maximum Likelihood estimation for the Poisson model

Description

This function performs Monte Carlo maximum likelihood (MCML) estimation for the geostatistical Poisson model with log link function.

Usage

poisson.log.MCML(
  formula,
  units.m = NULL,
  coords,
  data,
  ID.coords = NULL,
  par0,
  control.mcmc,
  kappa,
  fixed.rel.nugget = NULL,
  start.cov.pars,
  method = "BFGS",
  low.rank = FALSE,
  knots = NULL,
  messages = TRUE,
  plot.correlogram = TRUE
)

Arguments

`formula`	an object of class `formula` (or one that can be coerced to that class): a symbolic description of the model to be fitted.
`units.m`	an object of class `formula` indicating the multiplicative offset for the mean of the Poisson model; if not specified this is then internally set as 1.
`coords`	an object of class `formula` indicating the geographic coordinates.
`data`	a data frame containing the variables in the model.
`ID.coords`	vector of ID values for the unique set of spatial coordinates obtained from `create.ID.coords`. These must be provided if, for example, spatial random effects are defined at location-level but some of the covariates are at individual level. Warning: the spatial coordinates must all be distinct otherwise see `jitterDupCoords`. Default is `NULL`.
`par0`	parameters of the importance sampling distribution: these should be given in the following order `c(beta,sigma2,phi,tau2)`, where `beta` are the regression coefficients, `sigma2` is the variance of the Gaussian process, `phi` is the scale parameter of the spatial correlation and `tau2` is the variance of the nugget effect (if included in the model).
`control.mcmc`	output from `control.mcmc.MCML`.
`kappa`	fixed value for the shape parameter of the Matern covariance function.
`fixed.rel.nugget`	fixed value for the relative variance of the nugget effect; `fixed.rel.nugget=NULL` if this should be included in the estimation. Default is `fixed.rel.nugget=NULL`.
`start.cov.pars`	a vector of length two with elements corresponding to the starting values of `phi` and the relative variance of the nugget effect `nu2`, respectively, that are used in the optimization algorithm. If `nu2` is fixed through `fixed.rel.nugget`, then `start.cov.pars` represents the starting value for `phi` only.
`method`	method of optimization. If `method="BFGS"` then the `maxBFGS` function is used; otherwise `method="nlminb"` to use the `nlminb` function. Default is `method="BFGS"`.
`low.rank`	logical; if `low.rank=TRUE` a low-rank approximation of the Gaussian spatial process is used when fitting the model. Default is `low.rank=FALSE`.
`knots`	if `low.rank=TRUE`, `knots` is a matrix of spatial knots that are used in the low-rank approximation. Default is `knots=NULL`.
`messages`	logical; if `messages=TRUE` then status messages are printed on the screen (or output device) while the function is running. Default is `messages=TRUE`.
`plot.correlogram`	logical; if `plot.correlogram=TRUE` the autocorrelation plot of the samples of the random effect is displayed after completion of conditional simulation. Default is `plot.correlogram=TRUE`.

Details

This function performs parameter estimation for a geostatistical Poisson model with log link function. Conditionally on a zero-mean stationary Gaussian process $S(x)$ and mutually independent zero-mean Gaussian variables $Z$ with variance tau2, the observations y are generated from a Poisson distribution with mean $m\lambda$ , where $m$ is an offset defined through the argument units.m. A canonical log link is used, thus the linear predictor assumes the form

$\log(\lambda) = d'\beta + S(x) + Z,$

where $d$ is a vector of covariates with associated regression coefficients $\beta$ . The Gaussian process $S(x)$ has isotropic Matern covariance function (see matern) with variance sigma2, scale parameter phi and shape parameter kappa. In the poisson.log.MCML function, the shape parameter is treated as fixed. The relative variance of the nugget effect, nu2=tau2/sigma2, can also be fixed through the argument fixed.rel.nugget; if fixed.rel.nugget=NULL, then the relative variance of the nugget effect is also included in the estimation.

Monte Carlo Maximum likelihood. The Monte Carlo maximum likelihood method uses conditional simulation from the distribution of the random effect $T(x) = d(x)'\beta+S(x)+Z$ given the data y, in order to approximate the high-dimensiional intractable integral given by the likelihood function. The resulting approximation of the likelihood is then maximized by a numerical optimization algorithm which uses analytic epression for computation of the gradient vector and Hessian matrix. The functions used for numerical optimization are maxBFGS (method="BFGS"), from the maxLik package, and nlminb (method="nlminb").

Low-rank approximation. In the case of very large spatial data-sets, a low-rank approximation of the Gaussian spatial process $S(x)$ might be computationally beneficial. Let $(x_{1},\dots,x_{m})$ and $(t_{1},\dots,t_{m})$ denote the set of sampling locations and a grid of spatial knots covering the area of interest, respectively. Then $S(x)$ is approximated as $\sum_{i=1}^m K(\|x-t_{i}\|; \phi, \kappa)U_{i}$ , where $U_{i}$ are zero-mean mutually independent Gaussian variables with variance sigma2 and $K(.;\phi, \kappa)$ is the isotropic Matern kernel (see matern.kernel). Since the resulting approximation is no longer a stationary process (but only approximately), the parameter sigma2 is then multiplied by a factor constant.sigma2 so as to obtain a value that is closer to the actual variance of $S(x)$ .

Value

An object of class "PrevMap". The function summary.PrevMap is used to print a summary of the fitted model. The object is a list with the following components:

estimate: estimates of the model parameters; use the function coef.PrevMap to obtain estimates of covariance parameters on the original scale.

covariance: covariance matrix of the MCML estimates.

log.lik: maximum value of the log-likelihood.

y: observations.

units.m: offset.

D: matrix of covariates.

ID.coords: set of ID values defined through the argument ID.coords.

coords: matrix of the observed sampling locations.

method: method of optimization used.

kappa: fixed value of the shape parameter of the Matern function.

knots: matrix of the spatial knots used in the low-rank approximation.

const.sigma2: adjustment factor for sigma2 in the low-rank approximation.

h: vector of the values of the tuning parameter at each iteration of the Langevin-Hastings MCMC algorithm; see Laplace.sampling, or Laplace.sampling.lr if a low-rank approximation is used.

samples: matrix of the random effects samples from the importance sampling distribution used to approximate the likelihood function.

fixed.rel.nugget: fixed value for the relative variance of the nugget effect.

call: the matched call.

Author(s)

Emanuele Giorgi e.giorgi@lancaster.ac.uk

Peter J. Diggle p.diggle@lancaster.ac.uk

References

Diggle, P.J., Giorgi, E. (2019). Model-based Geostatistics for Global Public Health. CRC/Chapman & Hall.

Giorgi, E., Diggle, P.J. (2017). PrevMap: an R package for prevalence mapping. Journal of Statistical Software. 78(8), 1-29. doi: 10.18637/jss.v078.i08

Christensen, O. F. (2004). Monte carlo maximum likelihood in model-based geostatistics. Journal of Computational and Graphical Statistics 13, 702-718.

Higdon, D. (1998). A process-convolution approach to modeling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics 5, 173-190.