variog.diagnostic.lm {PrevMap} | R Documentation |
Variogram-based validation for linear geostatistical model fits
Description
This function performs model validation for a linear geostatistical model using Monte Carlo methods based on the variogram.
Usage
variog.diagnostic.lm(
object,
n.sim = 1000,
uvec = NULL,
plot.results = TRUE,
range.fact = 1,
which.test = "both",
param.uncertainty = FALSE
)
Arguments
object |
an object of class "PrevMap" obtained as an output from |
n.sim |
integer indicating the number of simulations used for the variogram-based diagnostics.
Default is |
uvec |
a vector with values used to define the variogram binning. If |
plot.results |
if |
range.fact |
a value between 0 and 1 used to disregard all distance bins provided through |
which.test |
a character specifying which test for residual spatial correlation is to be performed: "variogram", "test statistic" or "both". The default is |
param.uncertainty |
a logical indicating whether uncertainty in the model parameters should be incorporated in the selected diagnostic tests. Default is |
Details
The function takes as an input through the argument object
a fitted
linear geostatistical model for an outcome Y_i
, which is expressed as
Y_i=d_i'\beta+S(x_i)+Z_i
where d_i
is a vector of covariates which are specified through formula
, S(x_i)
is a spatial Gaussian process and the Z_i
are assumed to be zero-mean Gaussian.
The model validation is performed on the adopted stationary and isotropic Matern covariance function used for S(x_i)
.
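As an illustration of the model above, the following base R sketch simulates one data-set from Y_i = d_i'\beta + S(x_i) + Z_i. All numerical values and variable names are assumptions chosen for the example, and the Matern covariance is taken with smoothness 0.5 (the exponential case) so that no extra package is needed.

```r
# Sketch: simulate from Y_i = d_i' beta + S(x_i) + Z_i (illustrative values only).
set.seed(1)
n      <- 100
coords <- cbind(runif(n), runif(n))   # locations x_i in the unit square
D      <- cbind(1, rnorm(n))          # covariates d_i: intercept plus one covariate
beta   <- c(0.5, 1)                   # regression coefficients (assumed)
sigma2 <- 1                           # variance of S(x) (assumed)
phi    <- 0.2                         # scale of the spatial correlation (assumed)
tau2   <- 0.3                         # variance of the Gaussian noise Z_i (assumed)

U     <- as.matrix(dist(coords))      # pairwise distances between locations
Sigma <- sigma2 * exp(-U / phi)       # Matern covariance with smoothness 0.5

# chol() returns R with R'R = Sigma, so R'z has covariance Sigma.
S <- drop(crossprod(chol(Sigma), rnorm(n)))
Z <- rnorm(n, sd = sqrt(tau2))
Y <- drop(D %*% beta) + S + Z
```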
More specifically, the function allows the users to select either of the following validation procedures.
Variogram-based graphical validation
This graphical diagnostic is performed by setting which.test="both"
or which.test="variogram"
. The outputs are 95% intervals
(see below lower.lim
and upper.lim
) that are generated under the assumption that the fitted model did generate the analysed data-set.
This validation procedure proceeds through the following steps.
1. Obtain the mean, say \hat{Z}_i
, of the Z_i
conditioned on the data Y_i
.
2. Compute the empirical variogram using \hat{Z}_i
.
3. Simulate n.sim
data-sets under the fitted geostatistical model.
4. For each of the simulated data-sets, obtain \hat{Z}_i
as in Step 1.
Finally, compute the empirical variogram based on the resulting \hat{Z}_i
.
5. From the n.sim
variograms obtained in the previous step, compute the 95% interval for each distance bin.
If the observed variogram (obs.variogram
below), based on the \hat{Z}_i
from Step 2, falls within the 95% interval, there is no
evidence against the fitted spatial correlation model; if, instead, it partly falls outside the 95% interval, the fitted model may not adequately describe the spatial
correlation in the data.
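The five steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used by PrevMap: the helper names (emp_variog, cond_mean_Z, simulate_y), the parameter values and the exponential covariance are all hypothetical, the "observed" data are themselves simulated, and the parameters are held fixed at their assumed fitted values rather than re-estimated for each simulation.

```r
set.seed(1)

# Empirical variogram: mean of 0.5 * (z_i - z_j)^2 within each distance bin.
emp_variog <- function(z, U, uvec) {
  keep <- upper.tri(U)                                   # use each pair once
  bin  <- cut(U[keep], breaks = uvec, include.lowest = TRUE)
  g    <- 0.5 * outer(z, z, "-")[keep]^2
  as.vector(tapply(g, bin, mean))
}

# Step 1 helper: E[Z | Y] for a Gaussian model with known parameters,
# using Cov(Z, Y) = tau2 * I and Var(Y) = V.
cond_mean_Z <- function(y, mu, V_inv, tau2) tau2 * drop(V_inv %*% (y - mu))

n      <- 80
coords <- cbind(runif(n), runif(n))
U      <- as.matrix(dist(coords))
uvec   <- seq(0, 0.5, length.out = 11)                   # 10 distance bins

# Hypothetical "fitted" parameters.
mu <- rep(1, n); sigma2 <- 1; phi <- 0.15; tau2 <- 0.5
V     <- sigma2 * exp(-U / phi) + tau2 * diag(n)
V_inv <- solve(V)
L     <- t(chol(V))                                      # L %*% t(L) = V

simulate_y <- function() mu + drop(L %*% rnorm(n))

# Steps 1-2: stand-in for the observed data and its variogram.
y_obs         <- simulate_y()
obs_variogram <- emp_variog(cond_mean_Z(y_obs, mu, V_inv, tau2), U, uvec)

# Steps 3-4: simulate data-sets and recompute the variogram for each.
n_sim <- 200
sim_variograms <- t(replicate(
  n_sim,
  emp_variog(cond_mean_Z(simulate_y(), mu, V_inv, tau2), U, uvec)
))

# Step 5: pointwise 95% limits across simulations, one pair per bin.
lower_lim <- apply(sim_variograms, 2, quantile, probs = 0.025)
upper_lim <- apply(sim_variograms, 2, quantile, probs = 0.975)
inside    <- obs_variogram >= lower_lim & obs_variogram <= upper_lim
```

Here inside flags, bin by bin, whether the observed variogram lies within the simulated limits.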
Test for suitability of the adopted correlation function
This diagnostic test is performed if which.test="both"
or which.test="test statistic"
. Let v_{E}(B)
and v_{T}(B)
denote the empirical and theoretical variograms based on \hat{Z}_i
for the distance bin B
.
The test statistic used for testing residual spatial correlation is
T = \sum_{B} N(B) \{v_{E}(B)-v_{T}(B)\}
where N(B)
is the number of pairs of data-points falling within the distance bin B
(n.bins
below).
To obtain the distribution of the test statistic T
under the null hypothesis that the fitted model did generate the analysed data-set, we use the simulated empirical variograms as obtained in step 4 of the iterative procedure described in "Variogram-based graphical validation."
The p-value for the test of suitability of the fitted spatial correlation function is then computed by taking the proportion of simulated values for T
that are larger than the value of T
based on the original \hat{Z}_i
in Step 1.
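The statistic and its Monte Carlo p-value can be sketched as follows. The function name test_stat and all numbers are made up for illustration; the statistic is coded exactly as in the displayed formula, and the theoretical variogram v_T is simply taken as a given vector.

```r
# T = sum over bins of N(B) * (v_E(B) - v_T(B)), as in the displayed formula.
test_stat <- function(v_E, v_T, N) sum(N * (v_E - v_T))

# Illustrative inputs (made-up numbers, three distance bins).
N     <- c(10, 20, 30)               # number of pairs per bin
v_T   <- c(0.5, 0.8, 1.0)            # theoretical variogram per bin
v_obs <- c(0.6, 0.9, 1.0)            # observed empirical variogram
v_sim <- rbind(c(0.5, 0.8, 1.0),     # one row per simulated data-set
               c(0.7, 0.9, 1.1),
               c(0.4, 0.7, 0.9),
               c(0.6, 1.0, 1.0))

T_obs <- test_stat(v_obs, v_T, N)
T_sim <- apply(v_sim, 1, test_stat, v_T = v_T, N = N)

# p-value: proportion of simulated statistics larger than the observed one.
p_value <- mean(T_sim > T_obs)
```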
Value
An object of class "PrevMap.diagnostic" which is a list containing the following components:
obs.variogram
: a vector of length length(uvec)-1
containing the values of the variogram for each of
the distance bins defined through uvec
.
distance.bins
: a vector of length length(uvec)-1
containing the average distance within each of the distance bins
defined through uvec
.
n.bins
: a vector of length length(uvec)-1
containing the number of pairs of data-points falling within each distance bin.
lower.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the lower limits of the 95% intervals
generated under the assumption that the fitted model did generate the analysed data-set, at each of the distance bins defined through uvec
.
upper.lim
: (available only if which.test="both"
or which.test="variogram"
) a vector of length length(uvec)-1
containing the upper limits of the 95% intervals
generated under the assumption that the fitted model did generate the analysed data-set, at each of the distance bins defined through uvec
.
mode.rand.effects
: the predictive mode of the random effects from the fitted non-spatial generalized linear mixed model.
p.value
: (available only if which.test="both"
or which.test="test statistic"
) p-value of the test for residual spatial correlation.
lse.variogram
: (available only if lse.variogram=TRUE
) a vector of length length(uvec)-1
containing the values of the estimated Matern variogram via a weighted least square fit.
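To see why the per-bin components have length length(uvec)-1, the sketch below bins the pairwise distances of four points with a three-value uvec. It is an illustration only: the variable names are hypothetical, and the use of cut() with right-closed intervals is an assumption about the binning, not a description of the package internals.

```r
# Four points at the corners of the unit square.
coords <- cbind(c(0, 1, 0, 1), c(0, 0, 1, 1))
U <- as.matrix(dist(coords))
d <- U[upper.tri(U)]                 # 6 pairwise distances: four sides, two diagonals

uvec <- c(0, 1.2, 1.5)               # 3 break points define 2 distance bins
bin  <- cut(d, breaks = uvec, include.lowest = TRUE)

n_bins        <- as.vector(table(bin))           # pairs per bin
distance_bins <- as.vector(tapply(d, bin, mean)) # average distance per bin
```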