| georob {georob} | R Documentation | 
Robust Fitting of Spatial Linear Models
Description
The function georob fits a linear model with spatially correlated
errors to geostatistical data that are possibly contaminated by
independent outliers.  The regression coefficients and the parameters of
the variogram model are estimated by robust or Gaussian restricted
maximum likelihood (REML) or by Gaussian maximum likelihood (ML).
Usage
georob(formula, data, subset, weights, na.action, model = TRUE,
    x = FALSE, y = FALSE, contrasts = NULL, offset, locations,
    variogram.model = c("RMexp", "RMaskey", "RMbessel", "RMcauchy",
        "RMcircular", "RMcubic", "RMdagum", "RMdampedcos", "RMdewijsian",
        "RMfbm", "RMgauss", "RMgencauchy", "RMgenfbm", "RMgengneiting",
        "RMgneiting", "RMlgd", "RMmatern", "RMpenta", "RMqexp",
        "RMspheric", "RMstable", "RMwave", "RMwhittle"),
    param, fit.param = default.fit.param()[names(param)],
	  aniso = default.aniso(), fit.aniso = default.fit.aniso(),
    variogram.object = NULL,
    tuning.psi = 2, control = control.georob(),
    verbose = 0, ...)
Arguments
formula | 
 a symbolic description of the regression model for the
external drift to be fit (mandatory argument).  See
  | 
data | 
 an optional data frame, a
  | 
subset | 
 an optional vector specifying a subset of observations to be used in the fitting process.  | 
weights | 
 an optional vector of weights to be used in the fitting process, currently ignored.  | 
na.action | 
 a function which indicates what should happen when the
data contain   | 
model, x, y | 
 logical scalars.  If   | 
contrasts | 
 an optional list.  See the   | 
offset | 
 this optional argument can be used to specify an a
priori known component to be included in the linear predictor during
fitting.  An   | 
locations | 
 a one-sided formula defining the variables that are used as coordinates of the locations were the data was recorded (mandatory argument).  | 
variogram.model | 
 a character keyword defining the variogram model
to be fitted.  Currently, most basic variogram models provided formerly
by the now archived package RandomFields can be fitted (see
Details and   | 
param | 
 a named numeric vector with initial values of the
variogram parameters (mandatory argument).  The names of  
  | 
fit.param | 
 a named logical vector (or a function such as
  | 
aniso | 
 a named numeric vector with initial values (or a function such as
 
  | 
fit.aniso | 
 a named logical vector (or a function such as
  | 
variogram.object | 
 an optional list that defines a possibly nested variogram model. Each component is itself a list with the following components: 
 Note that the arguments   | 
tuning.psi | 
 positive numeric.  The tuning constant   | 
control | 
 a list specifying parameters that control the behaviour of
  | 
verbose | 
 positive integer controlling logging of diagnostic
messages to the console during model fitting.    | 
... | 
 further arguments passed to function (e.g.   | 
Details
georob fits a spatial linear model by robust (Künsch et al., 2011, Künsch
et al., in preparation) or Gaussian RE(ML) (Harville, 1977).
georobPackage describes the employed model and briefly
sketches the robust REML estimation and the robust external drift Kriging
method.  Here, we describe further details of georob.
Implemented variograms
Currently, most basic variogram models provided formerly by the now
archived package RandomFields can be fitted by georob (see
argument variogram.model and gencorr for a list of
implemented models).  Some of these models have in addition to
variance, snugget, nugget and scale further
parameters.  Initial values of these parameters (param) and
fitting flags (fit.param) must be passed to georob by the
same names as used for the models RM... in
gencorr.  Use the function param.names to
list additional parameters of a given variogram.model.
The arguments fit.param and fit.aniso are used to control
what variogram and anisotropy parameters are estimated and which are
kept at the constant initial values.  The functions
default.fit.param and default.fit.aniso set
reasonable default values for these arguments.  Note, as an aside, that
the function default.aniso sets (default) values of the
anisotropy parameters for an isotropic variogram.
Estimating parameters of power function variogram
The intrinsic variogram model RMfbm is over-parametrized when
both the variance (plus possibly snugget) and the
scale are estimated.  Therefore, to estimate the parameters of
this model, scale must be kept fixed at an arbitrary value by
using fit.param["scale"] = FALSE.
Estimating parameters of geometrically anisotropic variograms
The subsection Model of georobPackage describes
how such models are parametrized and gives definitions the various
elements of aniso.  Some additional remarks might be helpful:
The first semi-principal axis points into the direction with the farthest reaching auto-correlation, which is described by the range parameter
scale(\alpha).The ranges in the direction of the second and third semi-principal axes are given by
f_1\alphaandf_2 \alpha, with0 < f_2 \leq f_1 \leq 1.The default values for
aniso(f_1=1,f_2=1) define an isotropic variogram model.Valid ranges for the angles characterizing the orientation of the semi-variance ellipsoid are (in degrees):
\omega[0, 180],\phi[0, 180],\zeta[-90, 90].
Estimating variance of micro-scale variation
Simultaneous estimation of the variance of the micro-scale variation
(snugget, \sigma_\mathrm{n}^2), appears seemingly
as spatially uncorrelated with a given sampling design, and of the
variance (nugget, \tau^2) of the independent errors
requires that for some locations
\boldsymbol{s}_i replicated observations are
available.  Locations less or equal than zero.dist apart are
thereby considered as being coincident (see
control.georob).
Constraining estimates of variogram parameters
Parameters of variogram models can vary only within certain bounds (see
param.bounds and gencorr
for allowed ranges).  georob uses three mechanisms to constrain
parameter estimates to permissible ranges:
-  
Parameter transformations: By default, all variance (
variance,snugget,nugget), the rangescale, the anisotropy parametersf1andf2and many of the additional parameters are log-transformed before solving the estimating equations or maximizing the restricted log-likelihood and this warrants that the estimates are always positive (seecontrol.georobfor detailed explanations how to control parameter transformations). -  
Checking permissible ranges: The additional parameters of the variogram models such as the smoothness parameter
\nuof the Whittle-Matérn model are forced to stay in the permissible ranges by signalling an error tonleqslv,nlminboroptimif the current trial values are invalid. These functions then graciously update the trial values of the parameters and carry their task on. However, it is clear that such a procedure likely gets stuck at a point on the boundary of the parameter space and is therefore just a workaround for avoiding runtime errors due to invalid parameter values. -  
Exploiting the functionality of
nlminbandoptim: If a spatial model is fitted non-robustly, then the argumentslower,upper(andmethodofoptim) can be used to constrain the parameters (seecontrol.optimhow to pass them tooptim). Foroptimone has to use the argumentsmethod = "L-BFGS-B",lower = l,upper = u, where l and u are numeric vectors with the lower and upper bounds of the transformed parameters in the order as they appear in
c(variance, snugget, nugget, scale, ...)[fit.param], aniso[fit.aniso]),
where...are additional parameters of isotropic variogram models (use
param.names(variogram.model)to display the names and the order of the additional parameters forvariogram.model). Fornlminbone has to use the argumentslower = l,upper = u, where l and u are numeric vectors as described above. 
Computing robust initial estimates of parameters for robust REML
To solve the robustified estimating equations for
\boldsymbol{B} and
\boldsymbol{\beta} the following initial
estimates are used:
-  
\widehat{\boldsymbol{B}}= \boldsymbol{0},if this turns out to be infeasible, initial values can be passed togeorobby the argumentbhatofcontrol.georob. -  
\widehat{\boldsymbol{\beta}}is either estimated robustly by the functionlmrob,rqor non-robustly bylm(see argumentinitial.fixefofcontrol.georob). 
Finding the roots of the robustified estimating equations of the
variogram and anisotropy parameters is more sensitive to a good choice
of initial values than maximizing the Gaussian (restricted)
log-likelihood with respect to the same parameters.  If the initial
values for param and aniso are not sufficiently close to
the roots of the system of nonlinear equations, then
nleqslv may fail to find them.
Setting initial.param = TRUE (see control.georob)
allows one to find initial values that are
often sufficiently close to the roots so that
nleqslv converges.  This is achieved by:
Initial values of the regression parameters are computed by
lmrobirrespective of the choice forinitial.fixef(seecontrol.georob).Observations with “robustness weights” of the
lmrobfit, satisfying
\psi_c(\widehat{\varepsilon}_i/\widehat{\tau})/(\widehat{\varepsilon}_i/\widehat{\tau}) \leq \mbox{\code{min.rweight}}, are discarded (seecontrol.georob).The model is fit to the pruned data set by Gaussian REML using
nlminboroptim.The resulting estimates of the variogram parameters (
param,aniso) are used as initial estimates for the subsequent robust fit of the model bynleqslv.
Note that for step 3 above, initial values of param and
aniso must be provided to georob.
Estimating variance parameters by Gaussian (RE)ML
Unlike robust REML, where robustified estimating equations are solved
for the variance parameters nugget (\tau^2),
variance (\sigma^2), and possibly snugget
(\sigma_{\mathrm{n}}^2), for Gaussian (RE)ML the
variances can be re-parametrized to
the signal variance
\sigma_B^2 = \sigma^2 + \sigma_{\mathrm{n}}^2,the inverse relative nugget
\eta = \sigma_B^2 / \tau^2andthe relative auto-correlated signal variance
\xi = \sigma^2/\sigma_B^2.
georob maximizes then a (restricted) profile
log-likelihood that depends only on \eta, \xi,
\alpha, ..., and \sigma_B^2 is estimated by an explicit
expression that depends on these parameters (e.g. Diggle and
Ribeiro, 2006, p.  113).  This is usually more efficient than
maximizing the (restricted) log-likelihood with respect to the original
variance parameters \tau^2,
\sigma_{\mathrm{n}}^2 and \sigma^2.
georob chooses the parametrization automatically, but the user
can control it by the argument reparam of the function
control.georob.
Value
An object of class georob representing a robust (or Gaussian) (RE)ML
fit of a spatial linear model.  See
georobObject for the components of the fit.
Author(s)
Andreas Papritz papritz@retired.ethz.ch
with contributions by Cornelia Schwierz.
References
Diggle, P. J. and Ribeiro, P. J. R. (2006) Model-based Geostatistics. Springer, New York, doi:10.1007/978-0-387-48536-2.
Harville, D. A. (1977) Maximum likelihood approaches to variance component estimation and to related problems, Journal of the American Statistical Association, 72, 320–340, doi:10.1080/01621459.1977.10480998.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (in preparation) Robust Geostatistics.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (2011) Robust estimation of the external drift and the variogram of spatial data. Proceedings of the ISI 58th World Statistics Congress of the International Statistical Institute. doi:10.3929/ethz-a-009900710
See Also
georobPackage for a description of the model and a brief summary of the algorithms;
georobObject for a description of the class georob;
profilelogLik for computing profiles of Gaussian likelihoods;
plot.georob for display of RE(ML) variogram estimates;
control.georob for controlling the behaviour of georob;
georobModelBuilding for stepwise building models of class georob;
cv.georob for assessing the goodness of a fit by georob;
georobMethods for further methods for the class georob;
predict.georob for computing robust Kriging predictions;
lgnpp for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation for simulating realizations of a Gaussian process
from model fitted by georob; and finally
sample.variogram and fit.variogram.model
for robust estimation and modelling of sample variograms.
Examples
################
## meuse data ##
################
data(meuse)
## Gaussian REML fit
r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
    variogram.model = "RMexp",
    param = c(variance = 0.15, nugget = 0.05, scale = 200),
    tuning.psi = 1000)
summary(r.logzn.reml, correlation = TRUE)
plot(r.logzn.reml, lag.dist.def = seq(0, 2000, by = 100))
## robust REML fit
if(interactive()){
  ## example is run only in interactive session because cpu times exceeds 5 s
  r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1)
  summary(r.logzn.rob, correlation = TRUE)
  lines(r.logzn.rob, col = "red")
}
###################
## wolfcamp data ##
###################
data(wolfcamp)
## fitting isotropic IRF(0) model
r.irf0.iso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y,
    variogram.model = "RMfbm",
    param = c(variance = 10, nugget = 1500, scale = 1, alpha = 1.5),
    fit.param = default.fit.param(scale = FALSE, alpha = TRUE),
    tuning.psi = 1000)
summary(r.irf0.iso)
plot(r.irf0.iso, lag.dist.def = seq(0, 200, by = 7.5))
## fitting anisotropic IRF(0) model
if(interactive()){
  ## example is run only in interactive session because cpu times exceeds 5 s
  r.irf0.aniso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y,
      variogram.model = "RMfbm",
      param = c(variance = 5.9, nugget = 1450, scale = 1, alpha = 1),
      fit.param = default.fit.param(scale = FALSE, alpha = TRUE),
      aniso = default.aniso(f1 = 0.51, omega = 148.),
      fit.aniso = default.fit.aniso(f1 = TRUE, omega = TRUE),
      tuning.psi = 1000)
  summary(r.irf0.aniso)
  plot(r.irf0.aniso, lag.dist.def = seq(0, 200, by = 7.5),
      xy.angle.def = c(0, 22.5, 67.5, 112.5, 157.5, 180.),
      add = TRUE, col = 2:5)
  pchisq(2*(r.irf0.aniso[["loglik"]] - r.irf0.iso[["loglik"]]), 2, lower = FALSE)
}