sk_GLS {snapKrig} | R Documentation |
Generalized least squares (GLS) with Kronecker covariances for sk grids
Description
Computes coefficients b of the linear model E(Z) = Xb using the GLS equation
for sk grid g
and covariance model pars
. By default the function returns the
linear predictor as an sk object
Usage
sk_GLS(g, pars, X = NA, out = "s", fac_method = "eigen", fac = NULL)
Arguments
g |
a sk grid object (or list with entries 'gdim', 'gres', 'gval') |
pars |
list of form returned by |
X |
sk grid, matrix or NA, the linear predictors (in columns) excluding intercept |
out |
character, either 'b' (coefficients), 'z' or 's' (Xb), or 'a' (all) |
fac_method |
character, factorization method: 'eigen' (default) or 'chol' (see |
fac |
matrix or list, (optional) pre-computed covariance matrix factorization |
Details
This is the maximum likelihood estimator for the linear trend Xb if we
assume the covariance parameters (in pars
) are specified correctly.
The GLS solution is: b = ( X^T V^-1 X )^-1 X^T V^-1 z,
where V is the covariance matrix for data vector z (which is g[!is.na(g)]
), and X
is a matrix of covariates. V is generated from the covariance model pars
with grid
layout g
.
Operations with V^-1 are computed using the factorization fac
(see sk_var
), or
else as specified in fac_method
.
Argument X
can be an sk grid (matching g
) with covariates in layers; or it can be
a matrix of covariates. DO NOT include an intercept layer (all 1's) in argument X
or
you will get collinearity errors. Matrix X
should have independent columns, and its
rows should match the order of g[]
or g[!is.na(g)]
.
Use X=NA
to specify an intercept-only model; ie to fit a spatially constant mean. This
replaces X in the GLS equation by a vector of 1's.
By default out='s'
returns the linear predictor in an sk grid object. Change this to
'z'
to return it as a vector, or 'b'
to get the GLS coefficients only. Set it to 'a'
to get the second two return types (in a list) along with matrix X
and its factorization.
The length of the vector output for out='z'
will match the number of rows in X
.
This means that if NA
grid points are excluded from X
, they will not appear in
the output (and vice versa). In the X=NA
case, the length is equal to the number of
non-NA
points in g
. Note that if a point is observed in g
, the function will expect
its covariates to be included X
(ie X
should have no NA
s corresponding to non-NA
points in g
).
If g[]
is a matrix (a multi-layer grid), the covariates in X
are recycled
for each layer. Layers are assumed mutually independent and the GLS equation is evaluated
using the corresponding block-diagonal V. This is equivalent to (but faster than) calling
sk_GLS
separately on each layer with the same X
and averaging the resulting b estimates.
Value
linear predictor Xb as an sk grid, or numeric vector, or coefficients (see details)
See Also
sk
Other estimators:
sk_cmean()
Other variance-related functions:
sk_LL()
,
sk_cmean()
,
sk_nLL()
,
sk_sim()
,
sk_var()
Examples
# set up example grid and covariance parameters
gdim = c(45, 31)
g_empty = sk(gdim)
n = length(g_empty)
pars = utils::modifyList(sk_pars(g_empty, 'gau'), list(psill=2))
# generate spatial noise
g_noise = sk_sim(g_empty, pars)
plot(g_noise)
# generate more spatial noise to use as covariates
n_betas = 3
betas = stats::rnorm(n_betas, 0, 10)
g_X = sk_sim(g_empty, pars, n_layer=n_betas-1L)
X = g_X[]
X_all = cbind(1, X)
g_lm = g_empty
g_lm[] = as.vector(X_all %*% betas)
plot(g_lm)
# combine with noise to make "observed" data
g_obs = g_lm + g_noise
plot(g_obs)
# By default (out='s') the function returns the linear predictor
g_lm_est = sk_GLS(g_obs, pars, g_X, out='s')
g_lm_est
plot(g_lm_est)
# equivalent, but slightly faster to get vector output
max(abs( sk_GLS(g_obs, pars, g_X, out='z') - g_lm_est[] ))
# repeat with matrix X
max(abs( sk_GLS(g_obs, pars, g_X[], out='z') - g_lm_est[] ))
# return the GLS coefficients
betas_est = sk_GLS(g_obs, pars, g_X, out='b')
print(betas_est)
print(betas)
# compute trend manually as product of betas with X and intercept
lm_est = X_all %*% betas_est
max( abs(lm_est - g_lm_est[] ) )
# de-trend observations by subtracting linear predictor
plot(g_obs - g_lm_est)
# repeat with pre-computed eigen factorization (same result but faster)
fac_eigen = sk_var(g_obs, pars, fac_method='eigen', sep=TRUE)
betas_est_compare = sk_GLS(g_obs, pars, g_X, fac=fac_eigen, out='b')
max( abs( betas_est_compare - betas_est ) )
# missing data example
n_obs = 10
g_miss = g_obs
idx_miss = sort(sample.int(n, n-n_obs))
g_miss[idx_miss] = NA
is_obs = !is.na(g_miss)
plot(g_miss)
# coefficient estimates are still unbiased but less precise
betas_est = sk_GLS(g_miss, pars, g_X, out='b')
print(betas_est)
print(betas)
# set X to NA to estimate the spatially constant trend
b0 = sk_GLS(g_miss, pars, X=NA, out='b')
# matrix X does not need to include unobserved points, but output is filled to match X
X_obs = X[is_obs,]
sk_GLS(g_miss, pars, X=X_obs)
sk_GLS(g_miss, pars, X=X)
# verify GLS results manually
X_all_obs = cbind(1, X_obs)
V = sk_var(g_miss, pars)
z = g_miss[!is.na(g_miss)]
X_trans = t(X_all_obs) %*% solve(V)
betas_compare = solve( X_trans %*% X_all_obs ) %*% X_trans %*% z
betas_compare - betas_est
# multi-layer examples skipped to stay below 5s exec time on slower machines
# generate some extra noise for 10-layer example
g_noise_multi = sk_sim(g_empty, pars, n_layer=10)
g_multi = g_lm + g_noise_multi
betas_complete = sk_GLS(g_multi, pars, g_X, out='b')
print(betas_complete)
print(betas)
# multi-layer input shares covariates matrix X, and output is to a single layer
summary(sk_GLS(g_multi, pars, g_X))
summary(sk_GLS(g_multi, pars, X))
# note that X cannot be missing data where `g` is observed
# repeat with missing data
g_multi[!is_obs,] = NA
g_X_obs = g_X
g_X_obs[!is_obs,] = NA
betas_sparse = sk_GLS(g_multi, pars, X, out='b')
print(betas_sparse)
print(betas)
summary(sk_GLS(g_multi, pars, g_X))
summary(sk_GLS(g_multi, pars, X))
summary(sk_GLS(g_multi, pars, g_X_obs))
summary(sk_GLS(g_multi, pars, X_obs))