R: Spatial Blind Source Separation

sbss {SpatialBSS}

R Documentation

Spatial Blind Source Separation

Description

sbss estimates the unmixing matrix assuming a spatial blind source separation model by simultaneous/jointly diagonalizing the covariance matrix and one/many local covariance matrices. These local covariance matrices are determined by spatial kernel functions. Three types of such kernel functions are supported.

Usage

sbss(x, ...)

## Default S3 method:
sbss(x, coords, kernel_type = c('ring', 'ball', 'gauss'), 
     kernel_parameters, lcov = c('lcov', 'ldiff', 'lcov_norm'), ordered = TRUE, 
     kernel_list = NULL, rob_whitening = FALSE, ...)
## S3 method for class 'SpatialPointsDataFrame'
sbss(x, ...)
## S3 method for class 'sf'
sbss(x, ...)

Arguments

`x`	either a numeric matrix of dimension `c(n, p)` where the `p` columns correspond to the entries of the random field and the `n` rows are the observations, an object of class `SpatialPointsDataFrame` or an object of class `sf`.
`coords`	a numeric matrix of dimension `c(n,2)` where each row represents the coordinates of a point in the spatial domain. Only needed if `x` is a matrix and the argument `kernel_list` is `NULL`.
`kernel_type`	a string indicating which kernel function to use. Either `'ring'` (default), `'ball'` or `'gauss'`.
`kernel_parameters`	a numeric vector that gives the parameters for the kernel function. At least length of one for `'ball'` and `'gauss'` or two for `'ring'` kernel, see details.
`lcov`	a string indicating which type of local covariance matrix to use. Either `'lcov'` (default), `'ldiff'` or `'lcov_norm'`. See `sbss_asymp` for details on the latter option.
`ordered`	logical. If `TRUE` the entries of the latent field are ordered by the sum of squared (pseudo-)eigenvalues of the diagonalized local covariance matrix/matrices. Default is `TRUE`.
`kernel_list`	a list of spatial kernel matrices with dimension `c(n,n)`, see details. Usually computed by the function `spatial_kernel_matrix`.
`rob_whitening`	logical. If `TRUE` whitening is carried out with respect to the first spatial scatter matrix and not the sample covariance matrix, see details. Default is `FALSE`.
`...`	further arguments for the fast real joint diagonalization algorithm that jointly diagonalizes the local covariance matrices. See details and `frjd`.

Details

Three versions of local covariance matrices are implemented, the argument lcov determines which version is used:

'lcov':

LCov(f) = 1/n \sum_{i,j} f(d_{i,j}) (x(s_i)-\bar{x}) (x(s_j)-\bar{x})' ,
'ldiff':

LDiff(f) = 1/n \sum_{i,j} f(d_{i,j}) (x(s_i)-x(s_j)) (x(s_i)-x(s_j))',
'lcov_norm':

LCov^*(f) = 1/(n F^{1/2}_{f,n}) \sum_{i,j} f(d_{i,j}) (x(s_i)-\bar{x}) (x(s_j)-\bar{x})',

with

F_{f,n} = 1 / n \sum_{i,j} f^2(d_{i,j}).

Where d_{i,j} \ge 0 correspond to the pairwise distances between coordinates, x(s_i) are the p random field values at location s_i, \bar{x} is the sample mean vector, and the kernel function f(d) determines the locality. The choice 'lcov_norm' is useful when testing for the actual signal dimension of the latent field, see sbss_asymp and sbss_boot. LDiff matrices are supposed to be more robust when the random field shows a smooth trend. The following kernel functions are implemented and chosen with the argument kernel_type:

'ring': parameters are inner radius r_{in} and outer radius r_{out}, with r_{in} < r_{out}, and r_{in}, r_{out} \ge 0:

f(d;r_{in}, r_{out}) = I(r_{in} < d \le r_{out})
'ball': parameter is the radius r, with r \ge 0:

f(d;r) = I(d \le r)
'gauss': Gaussian function where 95% of the mass is inside the parameter r, with r \ge 0:

f(d;r) = exp(-0.5 (\Phi^{-1}(0.95) d/r)^2)

The argument kernel_type determines the used kernel function as presented above, the argument kernel_parameters gives the corresponding parameters for the kernel function. Specifically, if kernel_type equals 'ball' or 'gauss' then kernel_parameters is a numeric vector where each entry corresponds to one parameter. Hence, length(kernel_parameters) local covariance matrices are used. Whereas, if kernel_type equals 'ring', then kernel_parameters must be a numeric vector of even length where subsequently the inner and outer radii must be given (informally: c(r_in1, r_out1, r_in2, r_out2, ...)). In that case length(kernel_parameters) / 2 local covariance matrices are used.

Internally, sbss calls spatial_kernel_matrix to compute a list of c(n,n) kernel matrices based on the parameters given, where each entry of those matrices corresponds to f(d_{i,j}). Alternatively, such a list of kernel matrices can be given directly to the function sbss via the kernel_list argument. This is useful when sbss is called numerous times with the same coordinates/kernel functions as the computation of the kernel matrices is then done only once prior the actual sbss calls. For details see also spatial_kernel_matrix.

rob_whitening determines which scatter is used for the whitening step. If TRUE, whitening is carried out with respect to the scatter matrix defined by the lcov argument, where the kernel function is given by the argument kernel_type and the parameters correspond to the first occuring in the argument kernel_parameters. Therefore, at least two different kernel parameters need to be given. Note that only LDiff(f) matrices are positive definite, hence whitening with 'lcov' is likely to produce an error. If the argument is FALSE, whitening is carried out with respect to the usual sample covariance matrix. sbss internally calls white_data.

If more than one local covariance matrix is used sbss jointly diagonalizes these matrices with the function frjd. ... provides arguments for frjd, useful arguments might be:

eps: tolerance for convergence.
maxiter: maximum number of iterations.

Value

sbss returns a list of class 'sbss' with the following entries:

`s`	object of `class(x)` containing the estimated source random field.
`coords`	coordinates of the observations. Is `NULL` if `x` was a matrix and the argument `kernel_list` was not `NULL` at the `sbss` call.
`w`	estimated unmixing matrix.
`w_inv`	inverse of the estimated unmixing matrix.
`pevals`	(pseudo-)eigenvalues for each latent field entry.
`d`	matrix of stacked (jointly) diagonalized local covariance matrices with dimension `c(length(kernel_parameters)p,p)` for `'ball'` and `'gauss'` kernel or `c(` `(length(kernel_parameters) / 2)p,p)` for `'ring'` kernel.
`diags`	matrix of dimension `c(length(kernel_parameters),p)` where the rows contain the diagonal of the diagonalized local autocovariance matrices.
`x_mu`	columnmeans of `x`.
`cov_inv_sqrt`	square root of the inverse sample covariance matrix of `x`.

References

Muehlmann, C., Filzmoser, P. and Nordhausen, K. (2021), Spatial Blind Source Separation in the Presence of a Drift, Submitted for publication. Preprint available at https://arxiv.org/abs/2108.13813.

Bachoc, F., Genton, M. G, Nordhausen, K., Ruiz-Gazen, A. and Virta, J. (2020), Spatial Blind Source Separation, Biometrika, 107, 627-646, doi:10.1093/biomet/asz079.

Nordhausen, K., Oja, H., Filzmoser, P., Reimann, C. (2015), Blind Source Separation for Spatial Compositional Data, Mathematical Geosciences 47, 753-770, doi:10.1007/s11004-014-9559-5.

Muehlmann, C., Bachoc, F., Nordhausen, K. and Yi, M. (2022), Test of the Latent Dimension of a Spatial Blind Source Separation Model, to appear in Statistica Sinica, doi:10.5705/ss.202021.0326.v

Examples

# simulate coordinates
coords <- runif(1000 * 2) * 20
dim(coords) <- c(1000, 2)
coords_df <- as.data.frame(coords)
names(coords_df) <- c("x", "y")

# simulate random field
if (!requireNamespace('gstat', quietly = TRUE)) {
  message('Please install the package gstat to run the example code.')
} else {
  library(gstat)
  model_1 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, 
                   model = vgm(psill = 0.025, range = 1, model = 'Exp'), nmax = 20)
  model_2 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, 
                   model = vgm(psill = 0.025, range = 1, kappa = 2, model = 'Mat'), 
                   nmax = 20)
  model_3 <- gstat(formula = z ~ 1, locations = ~ x + y, dummy = TRUE, beta = 0, 
                   model = vgm(psill = 0.025, range = 1, model = 'Gau'), nmax = 20)
  field_1 <- predict(model_1, newdata = coords_df, nsim = 1)$sim1
  field_2 <- predict(model_2, newdata = coords_df, nsim = 1)$sim1
  field_3 <- predict(model_3, newdata = coords_df, nsim = 1)$sim1
  field <- as.matrix(cbind(field_1, field_2, field_3))

  # apply sbss with three ring kernels
  kernel_parameters <- c(0, 1, 1, 2, 2, 3)
  sbss_result <- 
    sbss(field, coords, kernel_type = 'ring', kernel_parameters = kernel_parameters)
  
  # print object
  print(sbss_result)
  
  # plot latent field
  plot(sbss_result, colorkey = TRUE, as.table = TRUE, cex = 1)
  
  # predict latent fields on grid
  predict(sbss_result, colorkey = TRUE, as.table = TRUE, cex = 1)
  
  # unmixing matrix
  w_unmix <- coef(sbss_result)
  
  # apply the same sbss with a kernel list
  kernel_list <- spatial_kernel_matrix(coords, kernel_type = 'ring', kernel_parameters)
  sbss_result_k <- sbss(field, kernel_list = kernel_list)
  
  # apply sbss with three ring kernels and local difference matrices
  sbss_result_ldiff <- 
    sbss(field, coords, kernel_type = 'ring', 
         kernel_parameters = kernel_parameters, lcov = 'ldiff')
}

[Package SpatialBSS version 0.14-0 Index]