| sk_GLS {snapKrig} | R Documentation |
Generalized least squares (GLS) with Kronecker covariances for sk grids
Description
Computes coefficients b of the linear model E(Z) = Xb using the GLS equation
for sk grid g and covariance model pars. By default the function returns the
linear predictor as an sk object
Usage
sk_GLS(g, pars, X = NA, out = "s", fac_method = "eigen", fac = NULL)
Arguments
g |
a sk grid object (or list with entries 'gdim', 'gres', 'gval') |
pars |
list of form returned by |
X |
sk grid, matrix or NA, the linear predictors (in columns) excluding intercept |
out |
character, either 'b' (coefficients), 'z' or 's' (Xb), or 'a' (all) |
fac_method |
character, factorization method: 'eigen' (default) or 'chol' (see |
fac |
matrix or list, (optional) pre-computed covariance matrix factorization |
Details
This is the maximum likelihood estimator for the linear trend Xb if we
assume the covariance parameters (in pars) are specified correctly.
The GLS solution is: b = ( X^T V^-1 X )^-1 X^T V^-1 z,
where V is the covariance matrix for data vector z (which is g[!is.na(g)]), and X
is a matrix of covariates. V is generated from the covariance model pars with grid
layout g.
Operations with V^-1 are computed using the factorization fac (see sk_var), or
else as specified in fac_method.
Argument X can be an sk grid (matching g) with covariates in layers; or it can be
a matrix of covariates. DO NOT include an intercept layer (all 1's) in argument X or
you will get collinearity errors. Matrix X should have independent columns, and its
rows should match the order of g[] or g[!is.na(g)].
Use X=NA to specify an intercept-only model; ie to fit a spatially constant mean. This
replaces X in the GLS equation by a vector of 1's.
By default out='s' returns the linear predictor in an sk grid object. Change this to
'z' to return it as a vector, or 'b' to get the GLS coefficients only. Set it to 'a'
to get the second two return types (in a list) along with matrix X and its factorization.
The length of the vector output for out='z' will match the number of rows in X.
This means that if NA grid points are excluded from X, they will not appear in
the output (and vice versa). In the X=NA case, the length is equal to the number of
non-NA points in g. Note that if a point is observed in g, the function will expect
its covariates to be included X (ie X should have no NAs corresponding to non-NA
points in g).
If g[] is a matrix (a multi-layer grid), the covariates in X are recycled
for each layer. Layers are assumed mutually independent and the GLS equation is evaluated
using the corresponding block-diagonal V. This is equivalent to (but faster than) calling
sk_GLS separately on each layer with the same X and averaging the resulting b estimates.
Value
linear predictor Xb as an sk grid, or numeric vector, or coefficients (see details)
See Also
sk
Other estimators:
sk_cmean()
Other variance-related functions:
sk_LL(),
sk_cmean(),
sk_nLL(),
sk_sim(),
sk_var()
Examples
# set up example grid and covariance parameters
gdim = c(45, 31)
g_empty = sk(gdim)
n = length(g_empty)
pars = utils::modifyList(sk_pars(g_empty, 'gau'), list(psill=2))
# generate spatial noise
g_noise = sk_sim(g_empty, pars)
plot(g_noise)
# generate more spatial noise to use as covariates
n_betas = 3
betas = stats::rnorm(n_betas, 0, 10)
g_X = sk_sim(g_empty, pars, n_layer=n_betas-1L)
X = g_X[]
X_all = cbind(1, X)
g_lm = g_empty
g_lm[] = as.vector(X_all %*% betas)
plot(g_lm)
# combine with noise to make "observed" data
g_obs = g_lm + g_noise
plot(g_obs)
# By default (out='s') the function returns the linear predictor
g_lm_est = sk_GLS(g_obs, pars, g_X, out='s')
g_lm_est
plot(g_lm_est)
# equivalent, but slightly faster to get vector output
max(abs( sk_GLS(g_obs, pars, g_X, out='z') - g_lm_est[] ))
# repeat with matrix X
max(abs( sk_GLS(g_obs, pars, g_X[], out='z') - g_lm_est[] ))
# return the GLS coefficients
betas_est = sk_GLS(g_obs, pars, g_X, out='b')
print(betas_est)
print(betas)
# compute trend manually as product of betas with X and intercept
lm_est = X_all %*% betas_est
max( abs(lm_est - g_lm_est[] ) )
# de-trend observations by subtracting linear predictor
plot(g_obs - g_lm_est)
# repeat with pre-computed eigen factorization (same result but faster)
fac_eigen = sk_var(g_obs, pars, fac_method='eigen', sep=TRUE)
betas_est_compare = sk_GLS(g_obs, pars, g_X, fac=fac_eigen, out='b')
max( abs( betas_est_compare - betas_est ) )
# missing data example
n_obs = 10
g_miss = g_obs
idx_miss = sort(sample.int(n, n-n_obs))
g_miss[idx_miss] = NA
is_obs = !is.na(g_miss)
plot(g_miss)
# coefficient estimates are still unbiased but less precise
betas_est = sk_GLS(g_miss, pars, g_X, out='b')
print(betas_est)
print(betas)
# set X to NA to estimate the spatially constant trend
b0 = sk_GLS(g_miss, pars, X=NA, out='b')
# matrix X does not need to include unobserved points, but output is filled to match X
X_obs = X[is_obs,]
sk_GLS(g_miss, pars, X=X_obs)
sk_GLS(g_miss, pars, X=X)
# verify GLS results manually
X_all_obs = cbind(1, X_obs)
V = sk_var(g_miss, pars)
z = g_miss[!is.na(g_miss)]
X_trans = t(X_all_obs) %*% solve(V)
betas_compare = solve( X_trans %*% X_all_obs ) %*% X_trans %*% z
betas_compare - betas_est
# multi-layer examples skipped to stay below 5s exec time on slower machines
# generate some extra noise for 10-layer example
g_noise_multi = sk_sim(g_empty, pars, n_layer=10)
g_multi = g_lm + g_noise_multi
betas_complete = sk_GLS(g_multi, pars, g_X, out='b')
print(betas_complete)
print(betas)
# multi-layer input shares covariates matrix X, and output is to a single layer
summary(sk_GLS(g_multi, pars, g_X))
summary(sk_GLS(g_multi, pars, X))
# note that X cannot be missing data where `g` is observed
# repeat with missing data
g_multi[!is_obs,] = NA
g_X_obs = g_X
g_X_obs[!is_obs,] = NA
betas_sparse = sk_GLS(g_multi, pars, X, out='b')
print(betas_sparse)
print(betas)
summary(sk_GLS(g_multi, pars, g_X))
summary(sk_GLS(g_multi, pars, X))
summary(sk_GLS(g_multi, pars, g_X_obs))
summary(sk_GLS(g_multi, pars, X_obs))