sk_cmean {snapKrig}    R Documentation
Compute kriging predictor (or variance) for an sk grid
Description
Evaluates the kriging prediction equations to find the expected value (mean) of
the spatial process for g at all grid points, including unobserved ones.
Usage
sk_cmean(
g,
pars,
X = NA,
what = "p",
out = "s",
fac_method = "chol",
fac = NULL,
quiet = FALSE
)
Arguments
g: an sk grid, or a list accepted by sk
pars: list of covariance parameters, of the form returned by sk_pars
X: sk grid, numeric vector, matrix, or NA: the mean, or its linear predictors
what: character, what to compute: one of 'p' (predictor), 'v' (variance), or 'm' (more)
out: character, the return object: either 's' (sk grid) or 'v' (vector)
fac_method: character, either 'chol' or 'eigen'
fac: (optional) pre-computed factorization of the covariance matrix scaled by the partial sill
quiet: logical, indicating to suppress console output
Details
This predicts a noiseless version of the random process from which grid g was
sampled, conditional on the observed data, and possibly a set of covariates. It is
optimal in the sense of minimizing mean squared prediction error under the covariance
model specified by pars, and assuming the predictions are a linear combination of
the observed data.
The estimation method is determined by X. Set it to 0 and supply a de-trended
g to do simple kriging. Set it to NA to estimate a spatially uniform mean (ordinary
kriging). Or pass covariates in X, either as a multi-layer sk grid or as a matrix, to do
universal kriging. See sk_GLS for details on specifying X in this case.
Set what='v' to return the point-wise kriging variance. This usually takes much
longer to evaluate than the predictor, but the memory demands are similar.
A progress bar is printed to the console in this case unless quiet=TRUE.
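For instance, the three estimation modes and the variance option differ only in the X and what arguments. A minimal sketch, assuming the snapKrig package is attached (the grid setup here is hypothetical):

```r
library(snapKrig)

# hypothetical setup: a simulated grid with most points marked unobserved
g_obs = sk_sim(10)
pars = sk_pars(g_obs)
n = length(g_obs)
g_obs[seq(n) %in% sample.int(n, round(0.75 * n))] = NA

g_sk = sk_cmean(g_obs, pars, X = 0)    # simple kriging (known mean of zero)
g_ok = sk_cmean(g_obs, pars, X = NA)   # ordinary kriging (uniform mean estimated)
g_ok_v = sk_cmean(g_obs, pars, X = NA, what = 'v', quiet = TRUE)  # point-wise variance
```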
Technical notes
All calculations returned by sk_cmean are exact. Our implementation is based on the
variance decomposition suggested in section 3.4 (pp. 153-155) of Cressie (1993), and uses a
loop over eigenvectors (for observed locations) to compute the variance iteratively.
In technical terms, sk_cmean estimates the mean of the signal process behind the data.
The nugget effect (eps) is therefore added to the diagonal of the covariance matrix for
the observed points, but NOT to the corresponding entries of the cross-covariance matrix.
This has the effect of smoothing (de-noising) predictions at observed locations, which means
sk_cmean is not an exact interpolator (except in the limit eps -> 0). Rather, it corrects
the observed data to make it consistent with the surrounding signal.
This is a good thing: real spatial datasets are almost always noisy, and we are typically
interested in the signal, not a distorted version of it. For more on this, see section 3 of
Cressie (1993), and in particular the discussion of the nugget effect in section 3.2.1.
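The smoothing effect can be checked directly by comparing predictions with the data at observed points. A sketch with a hypothetical setup; with a nonzero nugget the differences are generally nonzero:

```r
library(snapKrig)

# hypothetical setup: simulated data (which includes nugget noise)
g_all = sk_sim(10)
pars = sk_pars(g_all)
n = length(g_all)
g_miss = g_all
g_miss[seq(n) %in% sample.int(n, round(0.75 * n))] = NA
is_obs = !is.na(g_miss)

g_pred = sk_cmean(g_miss, pars, X = 0)

# not an exact interpolator: predictions at observed points are de-noised
summary(g_pred[is_obs] - g_miss[is_obs])
```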
The covariance factorization fac can be pre-computed using sk_var with argument
scaled=TRUE (and, if computing the variance, fac_method='eigen'). This speeds up
subsequent calls where only the observed data values have changed (same covariance
parameters pars and same NA structure in the data). The kriging variance does not change
in this case, so it only needs to be computed once.
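A sketch of this workflow (hypothetical setup; only the data values change between calls, so fac is computed once and reused):

```r
library(snapKrig)

# hypothetical setup
g1 = sk_sim(10)
pars = sk_pars(g1)
n = length(g1)
g1[seq(n) %in% sample.int(n, round(0.75 * n))] = NA
is_obs = !is.na(g1)

# factorize the scaled covariance of the observed points once
fac = sk_var(g1, pars, scaled = TRUE, fac_method = 'eigen')

# reuse it for the predictor and for the variance
p1 = sk_cmean(g1, pars, X = 0, fac = fac)
v1 = sk_cmean(g1, pars, X = 0, what = 'v', fac = fac, quiet = TRUE)

# new data values, same NA pattern: the same fac still applies
g2 = g1
g2[is_obs] = g1[is_obs] + 1
p2 = sk_cmean(g2, pars, X = 0, fac = fac)
```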
Reference: Cressie, N. (1993). Statistics for Spatial Data. Wiley.
Value
an sk grid (if out='s') or numeric vector (if out='v') containing the predicted values (or, if what='v', their variance)
See Also
sk sk_pars
Other estimators:
sk_GLS()
Other variance-related functions:
sk_GLS(),
sk_LL(),
sk_nLL(),
sk_sim(),
sk_var()
Examples
## set up very simple example problem
# make example grid and covariance parameters
g_all = sk_sim(10)
pars = sk_pars(g_all)
plot(g_all)
# remove most of the data
n = length(g_all)
p_miss = 0.90
is_miss = seq(n) %in% sample.int(n, round(p_miss*n))
is_obs = !is_miss
g_miss = g_all
g_miss[is_miss] = NA
plot(g_miss)
## simple kriging
# estimate the missing data conditional on what's left over
g_simple = sk_cmean(g_miss, pars, X=0)
plot(g_simple)
# variance of the estimator
g_simple_v = sk_cmean(g_miss, pars, X=0, what='v', quiet=TRUE)
plot(g_simple_v)
# get the same results with pre-computed variance
var_pc = sk_var(g_miss, pars, scaled=TRUE, fac_method='eigen')
g_simple_v_compare = sk_cmean(g_miss, pars, X=0, what='v', fac=var_pc, quiet=TRUE)
max(abs(g_simple_v_compare - g_simple_v))
## ordinary kriging
# estimate spatially uniform mean - true value is 0
sk_GLS(g_miss, pars, out='b')
# ordinary kriging automatically adjusts for the trend
g_ok = sk_cmean(g_miss, pars, X=NA)
# additional uncertainty in estimation means variance goes up a bit
g_ok_v = sk_cmean(g_miss, pars, X=NA, what='v', quiet=TRUE)
range(g_ok_v - g_simple_v)
## universal kriging
# introduce some covariates
n_betas = 3
betas = stats::rnorm(n_betas, 0, 10)
g_X = sk_sim(g_all, pars, n_layer=n_betas-1L)
g_lm_all = g_all + as.vector(cbind(1, g_X[]) %*% betas)
g_lm_miss = g_lm_all
g_lm_miss[is_miss] = NA
# prediction
g_uk = sk_cmean(g_lm_miss, pars, g_X)
g_uk_v = sk_cmean(g_lm_miss, pars, g_X, what='v', quiet=TRUE)
## repeat with special subgrid case (faster!)
# make g_all a subgrid of a larger example
g_super = sk_rescale(g_all, down=2)
# re-generate the covariates for the larger extent
g_X_super = sk_sim(g_super, pars, n_layer=n_betas-1L)
g_lm_super = g_super + as.vector(cbind(1, g_X_super[]) %*% betas)
# prediction
g_super_uk = sk_cmean(g_lm_super, pars, g_X_super)
g_super_uk_v = sk_cmean(g_lm_super, pars, g_X_super, what='v', quiet=TRUE)
## verification
# get observed variance and its inverse
V_full = sk_var(g_all, pars)
is_obs = !is.na(g_miss)
Vo = V_full[is_obs, is_obs]
Vo_inv = solve(Vo)
# get cross covariance
is_diag = as.logical(diag(nrow=length(g_all))[is_obs,])
Vc = V_full[is_obs,]
# nugget adjustment to diagonals (corrects the output at observed locations)
Vc[is_diag] = Vc[is_diag] - pars[['eps']]
# get covariates matrix and append intercept column
X = g_X[]
X_all = cbind(1, X)
X_obs = X_all[is_obs,]
# simple kriging the hard way
z = g_miss[is_obs]
z_trans = Vo_inv %*% z
z_pred_simple = c(t(Vc) %*% z_trans)
z_var_simple = pars[['psill']] - diag( t(Vc) %*% Vo_inv %*% Vc )
# ordinary kriging the hard way
x_trans = ( 1 - colSums( Vo_inv %*% Vc ) )
m_ok = x_trans / sum(Vo_inv)
z_pred_ok = c( (t(Vc) + m_ok) %*% z_trans )
z_var_ok = z_var_simple + diag( x_trans %*% t(m_ok) )
# universal kriging the hard way
z_lm = g_lm_miss[is_obs]
z_lm_trans = Vo_inv %*% z_lm
x_lm_trans = X_all - t( t(X_obs) %*% Vo_inv %*% Vc )
m_uk = x_lm_trans %*% solve(t(X_obs) %*% Vo_inv %*% X_obs)
z_pred_uk = c( (t(Vc) + t(X_obs %*% t(m_uk)) ) %*% z_lm_trans )
z_var_uk = z_var_simple + diag( x_lm_trans %*% t(m_uk) )
# check that these results agree with sk_cmean
max(abs(z_pred_simple - g_simple), na.rm=TRUE)
max(abs(z_var_simple - g_simple_v))
max(abs(z_pred_ok - g_ok), na.rm=TRUE)
max(abs(z_var_ok - g_ok_v))
max(abs(z_pred_uk - g_uk), na.rm=TRUE)
max(abs(z_var_uk - g_uk_v))
## repeat verification for sub-grid case
# rebuild matrices
V_full = sk_var(g_super_uk, pars)
is_obs = !is.na(g_super)
Vo = V_full[is_obs, is_obs]
Vo_inv = solve(Vo)
is_diag = as.logical(diag(nrow=length(g_super))[is_obs,])
Vc = V_full[is_obs,]
Vc[is_diag] = Vc[is_diag] - pars[['eps']]
X = g_X_super[]
X_all = cbind(1, X)
X_obs = X_all[is_obs,]
z = g_super[is_obs]
# universal kriging the hard way
z_var_simple = pars[['psill']] - diag( t(Vc) %*% Vo_inv %*% Vc )
z_lm = g_lm_super[is_obs]
z_lm_trans = Vo_inv %*% z_lm
x_lm_trans = X_all - t( t(X_obs) %*% Vo_inv %*% Vc )
m_uk = x_lm_trans %*% solve(t(X_obs) %*% Vo_inv %*% X_obs)
z_pred_uk = c( (t(Vc) + t(X_obs %*% t(m_uk)) ) %*% z_lm_trans )
z_var_uk = z_var_simple + diag( x_lm_trans %*% t(m_uk) )
# verification
max(abs(z_pred_uk - g_super_uk), na.rm=TRUE)
max(abs(z_var_uk - g_super_uk_v))