predgc {gcKrig} | R Documentation |
Prediction at Unobserved Locations in Gaussian Copula Models for Geostatistical Count Data
Description
Computes the plug-in prediction at unobserved sites. Two methods are implemented. If method = 'GHK', the maximum simulated likelihood estimates are computed and
sequential importance sampling is used to evaluate the integral. If method = 'GQT', the maximum surrogate likelihood estimates are computed and the
generalized quantile transform method is used to approximate the integral.
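As an illustration of the sequential importance sampling idea behind method = 'GHK', the following base R sketch estimates a rectangle probability of a multivariate normal vector. The function name ghk_prob and its arguments are hypothetical and written only for this illustration; this is not the code gcKrig runs internally. In a Gaussian copula model for counts, the rectangle limits are assumed here to be the normal quantiles of the marginal cdf at y - 1 and y.

```r
# Minimal GHK-style sketch (hypothetical helper, not part of gcKrig).
# Estimates P(lower < Z < upper) for Z ~ N(0, sigma).
ghk_prob <- function(lower, upper, sigma, nrep = 1000) {
  L <- t(chol(sigma))              # lower-triangular factor, sigma = L %*% t(L)
  d <- length(lower)
  z <- matrix(0, nrep, d)          # sampled standardized innovations
  w <- rep(1, nrep)                # importance weights
  for (j in seq_len(d)) {
    # Contribution of the components already sampled
    m <- if (j > 1) {
      drop(z[, 1:(j - 1), drop = FALSE] %*% L[j, 1:(j - 1)])
    } else {
      rep(0, nrep)
    }
    lo <- pnorm((lower[j] - m) / L[j, j])
    hi <- pnorm((upper[j] - m) / L[j, j])
    w <- w * (hi - lo)             # probability of the j-th conditional interval
    # Draw from the standard normal truncated to the conditional interval
    z[, j] <- qnorm(lo + runif(nrep) * (hi - lo))
  }
  mean(w)
}

set.seed(1)
sigma <- matrix(c(1, 0.5, 0.5, 1), 2)

# Check against a known orthant probability: 1/4 + asin(0.5) / (2 * pi) = 1/3
ghk_prob(c(-Inf, -Inf), c(0, 0), sigma, nrep = 5000)

# ASSUMPTION: Poisson(3) marginals, two observed counts y = (2, 4)
y <- c(2, 4)
lower <- qnorm(ppois(y - 1, lambda = 3))
upper <- qnorm(ppois(y, lambda = 3))
ghk_prob(lower, upper, sigma, nrep = 5000)
```

The number of replications plays the same role as nrep in ghkoptions: more replications reduce the Monte Carlo error of the estimate at a higher computing cost.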
Usage
predgc(obs.y, obs.x = NULL, obs.locs, pred.x = NULL, pred.locs,
longlat = FALSE, distscale = 1, marginal, corr, obs.effort = 1,
pred.effort = 1, method = "GHK", estpar = NULL, corrpar0 = NULL,
pred.interval = NULL, parallel = FALSE,
ghkoptions = list(nrep = c(100,1000), reorder = FALSE, seed = 12345),
paralleloptions = list(n.cores = 2, cluster.type = "SOCK"))
Arguments
obs.y |
a non-negative integer vector of observed responses, with length equal to the number of observed locations. |
obs.x |
a numeric matrix or data frame of covariates at observed locations,
with number of rows equal to the number of observed locations.
If there are no covariates, use obs.x = NULL (the default). |
obs.locs |
a numeric matrix or data frame of observed locations. The first column is x or longitude, the second column is y or latitude. The number of observed locations is equal to the number of rows. |
pred.x |
a numeric matrix or data frame of covariates at prediction locations,
with number of rows equal to the number of prediction locations.
If there are no covariates, use pred.x = NULL (the default). |
pred.locs |
a numeric matrix or data frame of prediction locations. The first column is x or longitude, the second column is y or latitude. The number of prediction locations is equal to the number of rows. |
longlat |
if FALSE, Euclidean distance is used; if TRUE, great-circle distance is used. The default is FALSE. |
distscale |
a numeric scaling factor for computing distance. If original distance is in kilometers, then
|
marginal |
an object of class |
corr |
an object of class |
obs.effort |
sampling effort at observed locations. For binomial marginal it is the size parameter (number of trials). See details. |
pred.effort |
sampling effort at prediction locations. For binomial marginal it is the size parameter (number of trials). See details. |
method |
the estimation and prediction method, either 'GHK' or 'GQT'. See Description for what each method computes. |
estpar |
if |
corrpar0 |
the starting value of correlation parameters in optimization procedure.
If |
pred.interval |
a number between |
parallel |
if |
ghkoptions |
a list of three elements that only need to be specified if
|
paralleloptions |
a list of two elements that only need to be specified if
|
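To make the longlat choice concrete, the sketch below contrasts Euclidean distance on planar coordinates with great-circle distance on longitude/latitude coordinates. The helper great_circle_km, the haversine formula and the Earth radius of 6371 km are assumptions made for illustration; the distance routine that gcKrig calls internally may differ.

```r
# Hypothetical helper: haversine great-circle distance in kilometers.
# Inputs are longitudes and latitudes in degrees.
great_circle_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Two hypothetical locations given as (longitude, latitude)
locs <- rbind(c(-98.5, 29.4), c(-97.7, 30.3))

# longlat = FALSE: coordinates are treated as planar x and y
dist(locs)

# longlat = TRUE: coordinates are treated as longitude and latitude
great_circle_km(locs[1, 1], locs[1, 2], locs[2, 1], locs[2, 2])
```

The two numbers are in different units (coordinate units versus kilometers), which is why the choice of longlat should match how obs.locs and pred.locs were recorded.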
Details
This function implements two methods for predicting the response at unobserved sites. See mlegc.
The arguments obs.effort and pred.effort are the (known) sampling efforts. They can be used to account for heterogeneity in
the measurement time or area at different locations. The default is 1 for all locations.
See Han and De Oliveira (2016) for more details.
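The role of the sampling effort can be sketched in a few lines of base R. The covariates, coefficients and effort values below are made up, and the multiplicative form of the mean under a log link is an assumption used for illustration; the cited reference gives the model's exact definition. For a binomial marginal, the effort is the number of trials, as stated in the argument descriptions.

```r
# Hypothetical inputs: three locations, an intercept and one covariate
X    <- cbind(1, c(0.2, 0.5, 0.9))
beta <- c(0.5, 1.0)
eta  <- drop(X %*% beta)           # linear predictor

# ASSUMPTION: count marginal with log link, where the effort
# (e.g. area or time sampled) scales the mean multiplicatively
effort <- c(1, 2, 0.5)
mu <- effort * exp(eta)

# Binomial marginal with logit link: effort is the number of trials
size <- c(10, 20, 5)
prob <- plogis(eta)
expected_successes <- size * prob
```

Under this assumption, doubling the effort at a location doubles its expected count while leaving the regression coefficients unchanged.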
The program computes two types of prediction intervals at a given confidence level. The shortest prediction interval is obtained by accumulating the predictive probabilities from highest to lowest until the confidence level is reached; the equal-tail prediction interval has equal tail probabilities.
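The two interval types can be illustrated on any discrete predictive distribution. The base R sketch below uses a hypothetical helper, pred_intervals, that takes a support and its probabilities; it is one plausible way to compute such intervals, not necessarily the exact rule implemented in predgc.

```r
# Hypothetical helper (not part of gcKrig): equal-tail and shortest
# prediction intervals from a discrete predictive pmf.
pred_intervals <- function(support, pmf, level = 0.95) {
  pmf <- pmf / sum(pmf)            # guard against a truncated support
  alpha <- 1 - level
  cdf <- cumsum(pmf)
  tol <- 1e-12                     # tolerance for floating-point sums

  # Equal-tail: cut off at most alpha / 2 probability in each tail
  lo <- support[which(cdf >= alpha / 2 - tol)[1]]
  hi <- support[which(cdf >= 1 - alpha / 2 - tol)[1]]

  # Shortest: add support points from highest to lowest probability
  # until the accumulated probability reaches the level
  ord <- order(pmf, decreasing = TRUE)
  k <- which(cumsum(pmf[ord]) >= level - tol)[1]
  kept <- support[ord[seq_len(k)]]

  # For a multimodal pmf the kept points may not be contiguous;
  # range() then reports the enclosing interval
  list(equal_tail = c(lo, hi), shortest = range(kept))
}

# Example with a right-skewed predictive distribution (Poisson with mean 1)
support <- 0:30
pred_intervals(support, dpois(support, lambda = 1), level = 0.9)
```

For skewed predictive distributions, which are common with small counts, the shortest interval is typically narrower than the equal-tail interval at the same confidence level.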
Value
A list of class "predgc" with the following elements:
obs.locs |
observed locations. |
obs.y |
observed values at observed locations. |
pred.locs |
prediction locations. |
predValue |
the expectation of the conditional predictive distribution. |
predCount |
predicted counts; the closest integer that |
predVar |
estimated variance of the prediction at prediction locations. |
ConfidenceLevel |
confidence level (between 0 and 1) if prediction interval is computed. |
predInterval.EqualTail |
equal-tail prediction interval. |
predInterval.Shortest |
shortest length prediction interval. |
Author(s)
Zifei Han hanzifei1@gmail.com
References
Han, Z. and De Oliveira, V. (2016) On the correlation structure of Gaussian copula models for geostatistical count data. Australian and New Zealand Journal of Statistics, 58:47-69.
Kazianka, H. and Pilz, J. (2010) Copula-based geostatistical modeling of continuous and discrete data including covariates. Stoch Environ Res Risk Assess 24:661-673.
Kazianka, H. (2013) Approximate copula-based estimation and prediction of discrete spatial data. Stoch Environ Res Risk Assess 27:2015-2026.
Masarotto, G. and Varin, C. (2012) Gaussian copula marginal regression. Electronic Journal of Statistics 6:1517-1549. https://projecteuclid.org/euclid.ejs/1346421603.
Masarotto, G. and Varin, C. (2017) Gaussian Copula Regression in R. Journal of Statistical Software, 77(8), 1–26. doi: 10.18637/jss.v077.i08.
Han, Z. and De Oliveira, V. (2018) gcKrig: An R Package for the Analysis of Geostatistical Count Data Using Gaussian Copulas. Journal of Statistical Software, 87(13), 1–32. doi: 10.18637/jss.v087.i13.
See Also
Examples
## Not run:
## For a fast check, predict at four locations only
data(Weed95)
weedobs <- Weed95[Weed95$dummy==1, ]
weedpred <- Weed95[Weed95$dummy==0, ]
predweed1 <- predgc(obs.y = weedobs$weedcount, obs.x = weedobs[,4:5], obs.locs = weedobs[,1:2],
                    pred.x = weedpred[1:4,4:5], pred.locs = weedpred[1:4,1:2],
                    marginal = negbin.gc(link = 'log'), pred.interval = 0.9,
                    corr = matern.gc(kappa = 0.5, nugget = TRUE), method = 'GHK')
#summary(predweed1)
#plot(predweed1)
## Time consuming examples
## Weed prediction at 200 locations using parallel programming
predweed2 <- predgc(obs.y = weedobs$weedcount, obs.x = weedobs[,4:5], obs.locs = weedobs[,1:2],
                    pred.x = weedpred[,4:5], pred.locs = weedpred[,1:2],
                    marginal = negbin.gc(link = 'log'),
                    corr = matern.gc(kappa = 0.5, nugget = TRUE), method = 'GHK',
                    pred.interval = 0.95, parallel = TRUE,
                    paralleloptions = list(n.cores = 4))
#summary(predweed2)
#plot(predweed2)
## A more time consuming example for generating a prediction map at a fine grid
data(OilWell)
gridstep <- seq(0.5, 30.5, length = 40)
## Build the 40 x 40 prediction grid once and name its columns
locOilpred <- expand.grid(Easting = gridstep, Northing = gridstep)
PredOil <- predgc(obs.y = OilWell[,3], obs.locs = OilWell[,1:2], pred.locs = locOilpred,
                  marginal = binomial.gc(link = 'logit'),
                  corr = matern.gc(nugget = FALSE), obs.effort = 1,
                  pred.effort = 1, method = 'GHK',
                  parallel = TRUE, paralleloptions = list(n.cores = 4))
PredMat <- summary(PredOil)
## To generate better prediction maps
library(colorspace)
## Map of the predicted means, with observed wells overlaid as points
filled.contour(seq(0.5,30.5,length=40), seq(0.5,30.5,length=40),
               matrix(PredMat$predMean,40,), zlim = c(0, 1), col=rev(heat_hcl(12)),
               nlevels=12, xlab = "Eastings", ylab = "Northings",
               plot.axes = {axis(1); axis(2); points(OilWell[,1:2], col = 1,
                            cex = 0.25 + 0.25*OilWell[,3])})
## Map of the prediction variances, with observed wells overlaid as points
filled.contour(seq(0.5,30.5,length=40), seq(0.5,30.5,length=40),
               matrix(PredMat$predVar,40,),
               zlim = c(0, 0.3), col = rev(heat_hcl(12)), nlevels = 10,
               xlab = "Eastings", ylab = "Northings",
               plot.axes = {axis(1); axis(2); points(OilWell[,1:2], col = 1,
                            cex = 0.25 + 0.25*OilWell[,3])})
## End(Not run)