spatpca {SpatPCA} | R Documentation |
Regularized PCA for spatial data
Description
Produce spatial dominant patterns and spatial predictions at the designated locations according to the specified tuning parameters or the selected tuning parameters by the M-fold cross-validation.
Usage
spatpca(
x,
Y,
M = 5,
K = NULL,
is_K_selected = ifelse(is.null(K), TRUE, FALSE),
tau1 = NULL,
tau2 = NULL,
gamma = NULL,
is_Y_detrended = FALSE,
maxit = 100,
thr = 1e-04,
num_cores = NULL
)
Arguments
x |
Location matrix ( |
Y |
Data matrix ( |
M |
Optional number of folds for cross validation; default is 5. |
K |
Optional user-supplied number of eigenfunctions; default is NULL. If K is NULL or is_K_selected is TRUE, K is selected automatically. |
is_K_selected |
If TRUE, K is selected automatically; otherwise, is_K_selected is set to be user-supplied K. Default depends on user-supplied K. |
tau1 |
Optional user-supplied numeric vector of a non-negative smoothness parameter sequence. If NULL, 10 tau1 values in a range are used. |
tau2 |
Optional user-supplied numeric vector of a non-negative sparseness parameter sequence. If NULL, none of tau2 is used. |
gamma |
Optional user-supplied numeric vector of a non-negative tuning parameter sequence. If NULL, 10 values in a range are used. |
is_Y_detrended |
If TRUE, center the columns of Y. Default is FALSE. |
maxit |
Maximum number of iterations. Default value is 100. |
thr |
Threshold for convergence. Default value is |
num_cores |
Number of cores used to parallel computing. Default value is NULL (See |
Details
An ADMM form of the proposed objective function is written as
\min_{\mathbf{\Phi}} \|\mathbf{Y}-\mathbf{Y}\mathbf{\Phi}\mathbf{\Phi}'\|^2_F +\tau_1\mbox{tr}(\mathbf{\Phi}^T\mathbf{\Omega}\mathbf{\Phi})+\tau_2\sum_{k=1}^K\sum_{j=1}^p |\phi_{jk}|,
\mbox{subject to $ \mathbf{\Phi}^T\mathbf{\Phi}=\mathbf{I}_K$,}
where \mathbf{Y}
is a data matrix, {\mathbf{\Omega}}
is a smoothness matrix, and \mathbf{\Phi}=\{\phi_{jk}\}
.
Value
A list of objects including
eigenfn |
Estimated eigenfunctions at the new locations, x_new. |
selected_K |
Selected K based on CV. Execute the algorithm when |
selected_tau1 |
Selected tau1. |
selected_tau2 |
Selected tau2. |
selected_gamma |
Selected gamma. |
cv_score_tau1 |
cv scores for tau1. |
cv_score_tau2 |
cv scores for tau2. |
cv_score_gamma |
cv scores for gamma. |
tau1 |
Sequence of tau1-values used in the process. |
tau2 |
Sequence of tau2-values used in the process. |
gamma |
Sequence of gamma-values used in the process. |
detrended_Y |
If is_Y_detrended is TRUE, detrended_Y means Y is detrended; else, detrended_Y is equal to Y. |
scaled_x |
Input location matrix. Only scale when it is one-dimensional |
Author(s)
Wen-Ting Wang and Hsin-Cheng Huang
References
Wang, W.-T. and Huang, H.-C. (2017). Regularized principal component analysis for spatial data. Journal of Computational and Graphical Statistics 26 14-25.
See Also
Examples
# The following examples only use two threads for parallel computing.
## 1D: regular locations
x_1D <- as.matrix(seq(-5, 5, length = 50))
Phi_1D <- exp(-x_1D^2) / norm(exp(-x_1D^2), "F")
set.seed(1234)
Y_1D <- rnorm(n = 100, sd = 3) %*% t(Phi_1D) + matrix(rnorm(n = 100 * 50), 100, 50)
cv_1D <- spatpca(x = x_1D, Y = Y_1D, num_cores = 2)
plot(x_1D, cv_1D$eigenfn[, 1], type = "l", main = "1st eigenfunction")
lines(x_1D, svd(Y_1D)$v[, 1], col = "red")
legend("topleft", c("SpatPCA", "PCA"), lty = 1:1, col = 1:2)
## 2D: Daily 8-hour ozone averages for sites in the Midwest (USA)
library(fields)
library(pracma)
library(maps)
data(ozone2)
x <- ozone2$lon.lat
Y <- ozone2$y
date <- as.Date(ozone2$date, format = "%y%m%d")
rmna <- !colSums(is.na(Y))
YY <- matrix(Y[, rmna], nrow = nrow(Y))
YY <- detrend(YY, "linear")
xx <- x[rmna, ]
cv <- spatpca(x = xx, Y = YY)
quilt.plot(xx, cv$eigenfn[, 1])
map("state", xlim = range(xx[, 1]), ylim = range(xx[, 2]), add = TRUE)
map.text("state", xlim = range(xx[, 1]), ylim = range(xx[, 2]), cex = 2, add = TRUE)
plot(date, YY %*% cv$eigenfn[, 1], type = "l", ylab = "1st Principal Component")
### new loactions
new_p <- 200
x_lon <- seq(min(xx[, 1]), max(xx[, 1]), length = new_p)
x_lat <- seq(min(xx[, 2]), max(xx[, 2]), length = new_p)
xx_new <- as.matrix(expand.grid(x = x_lon, y = x_lat))
eof <- spatpca(x = xx,
Y = YY,
K = cv$selected_K,
tau1 = cv$selected_tau1,
tau2 = cv$selected_tau2)
predicted_eof <- predictEigenfunction(eof, xx_new)
quilt.plot(xx_new,
predicted_eof[,1],
nx = new_p,
ny = new_p,
xlab = "lon.",
ylab = "lat.")
map("state", xlim = range(x_lon), ylim = range(x_lat), add = TRUE)
map.text("state", xlim = range(x_lon), ylim = range(x_lat), cex = 2, add = TRUE)
## 3D: regular locations
p <- 10
x <- y <- z <- as.matrix(seq(-5, 5, length = p))
d <- expand.grid(x, y, z)
Phi_3D <- rowSums(exp(-d^2)) / norm(as.matrix(rowSums(exp(-d^2))), "F")
Y_3D <- rnorm(n = 100, sd = 3) %*% t(Phi_3D) + matrix(rnorm(n = 100 * p^3), 100, p^3)
cv_3D <- spatpca(x = d, Y = Y_3D, tau2 = seq(0, 1000, length = 10))
library(plot3D)
library(RColorBrewer)
cols <- colorRampPalette(brewer.pal(9, "Blues"))(p)
isosurf3D(x, y, z,
colvar = array(cv_3D$eigenfn[, 1], c(p, p, p)),
level= seq(min(cv_3D$eigenfn[, 1]), max(cv_3D$eigenfn[, 1]), length = p),
ticktype = "detailed",
colkey = list(side = 1),
col = cols)