| predgc {gcKrig} | R Documentation |
Prediction at Unobserved Locations in Gaussian Copula Models for Geostatistical Count Data
Description
Computes the plug-in prediction at unobserved sites. Two methods are implemented. If
method = 'GHK' then the maximum simulated likelihood estimates are computed and
the sequential importance sampling method is used in the integral evaluation. If
method = 'GQT' then the maximum surrogate likelihood estimates are computed and the
generalized quantile transform method is used in integral approximation.
Usage
predgc(obs.y, obs.x = NULL, obs.locs, pred.x = NULL, pred.locs,
longlat = FALSE, distscale = 1, marginal, corr, obs.effort = 1,
pred.effort = 1, method = "GHK", estpar = NULL, corrpar0 = NULL,
pred.interval = NULL, parallel = FALSE,
ghkoptions = list(nrep = c(100,1000), reorder = FALSE, seed = 12345),
paralleloptions = list(n.cores = 2, cluster.type = "SOCK"))
Arguments
obs.y |
a non-negative integer vector of observed response with its length equals to the number of observed locations. |
obs.x |
a numeric matrix or data frame of covariates at observed locations,
with its number of rows equals to the number of observed locations.
If no covariates then |
obs.locs |
a numeric matrix or data frame of observed locations.obs.effort The first column is x or longitude, the second column is y or latitude. The number of observed locations is equal to the number of rows. |
pred.x |
a numeric matrix or data frame of covariates at prediction locations,
with its number of rows equals to the number of prediction locations.
If no covariates then |
pred.locs |
a numeric matrix or data frame of prediction locations. First column is x or longitude, second column is y or latitude. The number of prediction locations equals to the number of rows. |
longlat |
if FALSE, use Euclidean distance, if TRUE use great circle distance. The default is FALSE. |
distscale |
a numeric scaling factor for computing distance. If original distance is in kilometers, then
|
marginal |
an object of class |
corr |
an object of class |
obs.effort |
sampling effort at observed locations. For binomial marginal it is the size parameter (number of trials). See details. |
pred.effort |
sampling effort at prediction locations. For binomial marginal it is the size parameter (number of trials). See details. |
method |
two methods are implemented. If
|
estpar |
if |
corrpar0 |
the starting value of correlation parameters in optimization procedure.
If |
pred.interval |
a number between |
parallel |
if |
ghkoptions |
a list of three elements that only need to be specified if
|
paralleloptions |
a list of two elements that only need to be specified if
|
Details
This program implemented two methods in predicting the response at unobserved sites. See mlegc.
The argument obs.effort and pred.effort
are the sampling effort (known). It can be used to consider heterogeneity of
the measurement time or area at different locations. The default is 1 for all locations.
See Han and De Oliveira (2016) for more details.
The program computes two types of prediction intervals at a given confidence level. The shortest prediction interval is obtained from evaluating the highest to lowest prediction densities; the equal tail prediction interval has equal tail probabilities.
Value
A list of class "predgc" with the following elements:
obs.locs |
observed locations. |
obs.y |
observed values at observed locations. |
pred.locs |
prediction locations. |
predValue |
the expectation of the conditional predictive distribution. |
predCount |
predicted counts; the closest integer that |
predVar |
estimated variance of the prediction at prediction locations. |
ConfidenceLevel |
confidence level (between 0 to 1) if prediction interval is computed. |
predInterval.EqualTail |
equal-tail prediction interval. |
predInterval.Shortest |
shortest length prediction interval. |
Author(s)
Zifei Han hanzifei1@gmail.com
References
Han, Z. and De Oliveira, V. (2016) On the correlation structure of Gaussian copula models for geostatistical count data. Australian and New Zealand Journal of Statistics, 58:47-69.
Kazianka, H. and Pilz, J. (2010) Copula-based geostatistical modeling of continuous and discrete data including covariates. Stoch Environ Res Risk Assess 24:661-673.
Kazianka, H. (2013) Approximate copula-based estimation and prediction of discrete spatial data. Stoch Environ Res Risk Assess 27:2015-2026.
Masarotto, G. and Varin, C. (2012) Gaussian copula marginal regression. Electronic Journal of Statistics 6:1517-1549. https://projecteuclid.org/euclid.ejs/1346421603.
Masarotto, G. and Varin C. (2017). Gaussian Copula Regression in R. Journal of Statistical Software, 77(8), 1–26. doi: 10.18637/jss.v077.i08.
Han, Z. and De Oliveira, V. (2018) gcKrig: An R Package for the Analysis of Geostatistical Count Data Using Gaussian Copulas. Journal of Statistical Software, 87(13), 1–32. doi: 10.18637/jss.v087.i13.
See Also
Examples
## Not run:
## For fast check predict at four locations only
data(Weed95)
weedobs <- Weed95[Weed95$dummy==1, ]
weedpred <- Weed95[Weed95$dummy==0, ]
predweed1 <- predgc(obs.y = weedobs$weedcount, obs.x = weedobs[,4:5], obs.locs = weedobs[,1:2],
pred.x = weedpred[1:4,4:5], pred.locs = weedpred[1:4,1:2],
marginal = negbin.gc(link = 'log'), pred.interval = 0.9,
corr = matern.gc(kappa = 0.5, nugget = TRUE), method = 'GHK')
#summary(predweed1)
#plot(predweed1)
## Time consuming examples
## Weed prediction at 200 locations using parallel programming
predweed2 <- predgc(obs.y = weedobs$weedcount, obs.x = weedobs[,4:5], obs.locs = weedobs[,1:2],
pred.x = weedpred[,4:5], pred.locs = weedpred[,1:2],
marginal = negbin.gc(link = 'log'),
corr = matern.gc(kappa = 0.5, nugget = TRUE), method = 'GHK',
pred.interval = 0.95, parallel = TRUE,
paralleloptions = list(n.cores = 4))
#summary(predweed2)
#plot(predweed2)
## A more time consuming example for generating a prediction map at a fine grid
data(OilWell)
gridstep <- seq(0.5, 30.5, length = 40)
locOilpred <- data.frame(Easting = expand.grid(gridstep, gridstep)[,1],
Northing = expand.grid(gridstep, gridstep)[,2])
PredOil <- predgc(obs.y = OilWell[,3], obs.locs = OilWell[,1:2], pred.locs = locOilpred,
marginal = binomial.gc(link = 'logit'),
corr = matern.gc(nugget = FALSE), obs.effort = 1,
pred.effort = 1, method = 'GHK',
parallel = TRUE, paralleloptions = list(n.cores = 4))
PredMat <- summary(PredOil)
## To generate better prediction maps
library(colorspace)
filled.contour(seq(0.5,30.5,length=40), seq(0.5,30.5,length=40),
matrix(PredMat$predMean,40,), zlim = c(0, 1), col=rev(heat_hcl(12)),
nlevels=12, xlab = "Eastings", ylab = "Northings",
plot.axes = {axis(1); axis(2); points(OilWell[,1:2], col = 1,
cex = 0.25 + 0.25*OilWell[,3])})
filled.contour(seq(0.5,30.5,length=40), seq(0.5,30.5,length=40),
matrix(PredMat$predVar,40,),
zlim = c(0, 0.3), col = rev(heat_hcl(12)), nlevels = 10,
xlab = "Eastings", ylab = "Northings",
plot.axes = {axis(1); axis(2); points(OilWell[,1:2], col = 1,
cex = 0.25 + 0.25*OilWell[,3])})
## End(Not run)