georobSimulation {georob} | R Documentation |
Simulating Realizations of Gaussian Processes
Description
This page documents the function condsim
that
simulates (un)conditional realizations of Gaussian processes from the
parameters of a spatial linear model estimated by the function
georob
.
Usage
condsim(object, newdata, nsim, seed, type = c("response", "signal"),
locations, trend.coef = NULL,
variogram.model = NULL, param = NULL, aniso = NULL, variogram.object = NULL,
control = control.condsim(), verbose = 0)
control.condsim(use.grid = FALSE, grid.refinement = 2.,
condsim = TRUE, ce.method = c( "standard", "approximate" ),
ce.grid.expansion = 1., include.data.sites = FALSE,
means = FALSE, trend.covariates = FALSE, covariances = FALSE,
ncores = 1, mmax = 10000, pcmp = control.pcmp())
Arguments
object |
an object of class |
newdata |
a mandatory data frame,
|
nsim |
a positive interger with the number of (condititional) realizations to compute (mandatory argument). |
seed |
an integer seed to initialize random number generation,
see |
type |
a character keyword defining what target quantity should be simulated. Possible values are
see |
locations |
an optional one-sided formula specifying what variables
of |
trend.coef |
an optional numeric vector with the coefficients of the trend model to be used for computing the (conditional) mean function of the random process see Details. |
variogram.model |
an optional character keyword defining the
variogram model to be used for the simulations, see |
param |
an optional named numeric vector with values of the
variogram parameters used for the simulations, see |
aniso |
an optional named numeric vector with values of anisotropy
parameters of a variogram used for the simulations, see
|
variogram.object |
an optional list that defines a possibly nested
variogram model used for the simulations, see |
control |
a list with the components |
verbose |
a positive integer controlling logging of diagnostic
messages to the console. |
use.grid |
a logical scalar (default |
grid.refinement |
a numeric that defines a factor by which the
minimum differences of the coordinates between any pair of points in
|
condsim |
a logical scalar to control whether conditional
( |
ce.method |
a character keyword to select the method to simulate realizations by the circulant embedding algorithm, see Details. |
ce.grid.expansion |
a numeric with the factor by which the
dimensions of the simulation grid is expanded in the circulant embedding
algorithm. Should be |
include.data.sites |
a logical scalar, to control whether
(conditionally) simulated values are computed also for the points of the
original data set used to estimate the model parameters and contained in
|
means |
a logical scalar, to control whether the (un)conditional means are included in the output. |
trend.covariates |
a logical scalar, to control whether the covariates required for the trend model are included in the output. |
covariances |
a logical scalar, to control whether the covariances
between the points of the original data set used to estimate the model
parameters ( |
ncores |
a positive integer controlling how many cores are used for parallelized computations, defaults to 1. |
mmax |
a positive integer equal to the maximum number (default
|
pcmp |
a list of arguments, passed e.g. to |
Details
General
condsim
(conditionally) simulates from a Gaussian process that
has a linear mean function with parameters
and an auto-correlation structure
characterized by a parametric variogram model and variogram parameters
and
(see
georobPackage
for the employed parametrization of the
spatial linear model). The parameters of the mean and auto-correlation
function are either taken from the spatial linear model estimated by
georob
and passed by the argument
object
to condsim
or from the optional arguments
trend.coef
()
and
variogram.model
, param
, aniso
or
variogram.object
(,
).
Simulated values are computed for the points in newdata
and
optionally also for the data points in object
if
include.data.sites = TRUE
. Both unconditional and conditional
simulations can be computed. In the latter cases, the simulated values
are always conditioned to the response data used to fit the spatial
linear model by georob
and contained in object
.
Unconditional simulation
Unconditional realizations are either computed for the exact locations of
the points in newdata
(use.grid = FALSE
), irrespective of
the fact whether these are arranged on a regular grid or not.
Simulations are then generated by the Cholesky matrix decomposition
method (e.g. Chilès and Delfiner, 1999, sec.
7.2.2).
For use.grid = TRUE
the points in newdata
are matched to a
rectangular simulation grid and the simulations are generated for all
nodes of this grid by the circulant embedding method (Davis and
Bryant, 2013; Dietrich and Newsam, 1993; Wood and Chan,
1994). For large problems this approach may be substantially faster and
less memory demanding than the Cholesky matrix decomposition method.
For circulant embedding, first a rectangular simulation grid is
constructed from the coordinates of the points in newdata
and
object
. The spacings of the simulation grid is equal to the
minimum coordinate differences between any pair of points in
newdata
, divided by grid.refinement
. The spatial extent of
the simulation grid is chosen such that it covers the bounding boxes of
all points in newdata
and object
. The points in
newdata
and object
are then matched to the closest nodes of
the simulation grid. If the same node is assigned to a point in
object
and newdata
then the point in object
is kept
and the concerned point in newdata
is omitted.
The rectangular simulation grid is then expanded to the larger circulant embedding grid, and the eigenvalues of the so-called base matrix (= first row of the covariance matrix of the nodes of the circulant embedding grid with block circulant structure, see Davies and Bryant, 2013) are computed by fast discrete Fourier transform (FFT). It may happen that some of the eigenvalues of the base matrix are negative. The standard circulant embedding algorithm then fails.
Two approaches are implemented in condsim
to handle this
situation:
First, one may use the approximate circulant embedding method by choosing
ce.method = "approximate"
. This sets the negative eigenvalues of the base matrix to zero and scales the eigenvalues, see Chan and Wood (1994, sec. 4, choice).
Second, one may attempt to avoid the problem of negative eigenvalues by increasing the size of the simulation (and circulant embedding) grids. This can be achieved by choosing a value
for the argument
ce.grid.expansion
, see respective parts in Dietrich and Newsam (1993, sec. 4) and Wood and Chan (1994, sec. 3).
Note that the dimension of the simulation and embedding grids are chosen such that the number of nodes is a highly composite integer. This allows efficient FFT.
Conditional simulation
For both the Cholesky matrix decomposition and the circulant embedding approach, simulations are conditioned to data by the Kriging method, see Chilès and Delfiner, 1999, sec. 7.3.
Parallelized computations
condsim
uses the packages parallel and snowfall for
parallelized computations. Three tasks can be executed in parallel:
Computation of (generalized correlations), see
control.pcmp
how to do this.Computation of Kriging predictions required for conditional simulations, see section Details of
predict.georob
.Fast Fourier transform of realizations of standard normal deviates generated for the nodes of the base matrix (see Davies and Bryant, 2013, steps 3–5 of algorithm). If there are
nsim
realizations to simulate, the task is split intoceiling(nsim / ncores)
sub-tasks that are then distributed toncores
CPUs. Evidently,ncores = 1
(default) suppresses parallel execution.
Value
The output generated by condsim
is an object of a “similar”
class as newdata
(data frame,
SpatialPointsDataFrame
,
SpatialPixelsDataFrame
,
SpatialGridDataFrame
,
SpatialPolygonsDataFrame
).
The data frame or the
data
slot of the Spatial...DataFrame
objects
have the following components:
the coordinates of the prediction points (only present if
newdata
is a data frame).-
expct
: optionally the (un)conditional means. optionally the covariates required for the trend model.
-
sim.1
,sim.2
, ...: the (un)conditionally simulated realizations.
The function control.condsim
returns a list with parameters to
steer condsim
, see arguments above.
Author(s)
Andreas Papritz papritz@retired.ethz.ch.
References
Chilès, J.-P., Delfiner, P. (1999) Geostatistics Modeling Spatial Uncertainty, Wiley, New York, doi:10.1002/9780470316993.
Davies, T. M., Bryant, D. (2013) On circulant embedding for gaussian random fields in R, Journal of Statistical Software, 55, 1–21, doi:10.18637/jss.v055.i09.
Dietrich, C. R., Newsam, G. N. (1993) A fast and exact method for multidimensional gaussian stochastic simulations, Water Resources Research, 9, 2861–2869, doi:10.1029/93WR01070.
Wood, A. T. A., Chan, G. (1994) Simulation of stationary gaussian
processes in , Journal of Computational and Graphcal
Statistics, 3, 409–432, doi:10.2307/1390903.
See Also
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
Examples
data(meuse)
data(meuse.grid)
## convert to SpatialGridDataFrame
meuse.grid.sgdf <- meuse.grid
coordinates(meuse.grid.sgdf) <- ~ x + y
gridded(meuse.grid.sgdf) <- TRUE
fullgrid(meuse.grid.sgdf) <- TRUE
## Gaussian REML fit
r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse,
locations = ~ x + y, variogram.model = "RMexp",
param = c(variance = 0.15, nugget = 0.05, scale = 200),
tuning.psi = 1000)
## Unconditional simulations using circulant embedding on rectangular
## simulation grid
r.sim.1 <- condsim(r.logzn.reml, newdata = meuse.grid.sgdf, nsim = 2, seed = 1,
control = control.condsim(use.grid = TRUE, condsim = FALSE))
spplot(r.sim.1, zcol = "sim.1", at = seq(3.5, 8.5, by = 0.5))
## Conditional simulations using circulant embedding
if(interactive()){
## example is run only in interactive session because cpu times exceeds 5 s
r.sim.2 <- condsim(r.logzn.reml, newdata = meuse.grid.sgdf, nsim = 2, seed = 1,
control = control.condsim(use.grid = FALSE, condsim = TRUE))
spplot(r.sim.2, zcol = "sim.2", at = seq(3.5, 8.5, by = 0.5))
}