penalty {gps} | R Documentation |
Wiggliness penalties for penalized B-splines
Description
For penalized B-splines (including standard or general P-splines and O-splines), (1) construct matrix \bm{D}
in the wiggliness penalty \|\bm{D\beta}\|^2
; (2) sample B-spline coefficients from their prior distribution \textrm{N}(\bm{0},\ (\bm{D'D})^-)
; (3) compute the Moore-Penrose generalized inverse matrix (\bm{D'D})^-
.
Usage
SparseD(xt, d, m = NULL, gps = TRUE)
PriorCoef(n, D)
MPinv(D, only.diag = FALSE)
Arguments
xt |
full knot sequence for ordinary B-splines ( |
d |
B-spline order ( |
m |
penalty order ( |
gps |
if TRUE, return |
n |
number of samples to draw from the prior distribution. |
D |
matrix |
only.diag |
if TURE, only diagonal elements are computed. |
Details
General Difference Penalty for General P-Splines
A general P-spline is characterized by an order-m
general difference matrix \bm{D}_{\textrm{gps}}
, which can be computed by SparseD(..., gps = TRUE)
. For interpretation, the differenced coefficients \bm{D}_{\textrm{gps}}\bm{\beta}
are in fact f^{(m)}(x)
's B-spline coefficients, so the penalty is their squared L_2
norm.
Derivative Penalty for O-Splines
An O-spline is characterized by \bm{D}_{\textrm{os}}
such that \|\bm{D}_{\textrm{os}}\bm{\beta}\|^2 = \int_a^b f^{(m)}(x)^2\textrm{d}x
. Since f^{(m)}(x)
has B-spline coefficients \bm{D}_{\textrm{gps}}\bm{\beta}
, the integral can be shown to be \bm{\beta'}\bm{D}_{\textrm{gps}}'\bm{\bar{S}}\bm{D}_{\textrm{gps}}\bm{\beta}
, where \bm{\bar{S}}
is the Gram matrix of those B-splines representing f^{(m)}(x)
. Following the Cholesky factorization \bm{\bar{S}} = \bm{U'U}
, the quadratic form becomes \|\bm{U}\bm{D}_{\textrm{gps}}\bm{\beta}\|^2
, so that \bm{D}_{\textrm{os}} = \bm{U}\bm{D}_{\textrm{gps}}
. This matrix can be computed by SparseD(..., gps = FALSE)
, with \bm{\bar{S}}
and \bm{D}_{\textrm{gps}}
also returned in a "sandwich" attribute.
Penalty Matrix
We can express the L_2
penalty \|\bm{D\beta}\|^2
as quadratic form \bm{\beta'S\beta}
, where \bm{S} = \bm{D'D}
is called a penalty matrix. It is trivial to compute \bm{S}
(using function crossprod
) once \bm{D}
is available, so we don't feel the need to provide a function for this. Note that the link between \bm{D}_{\textrm{os}}
and \bm{D}_{\textrm{gps}}
implies a sandwich formula \bm{S}_{\textrm{os}} = \bm{D}_{\textrm{gps}}'\bm{\bar{S}}\bm{D}_{\textrm{gps}}
, wherease \bm{S}_{\textrm{gps}} = \bm{D}_{\textrm{gps}}'\bm{D}_{\textrm{gps}}
.
The Bayesian View
In the Bayesian view, the penalty \bm{\beta'S\beta}
is a Gaussian prior for B-spline coefficients \bm{\beta}
. But it is an improper one because \bm{S}
has a null space where an unpenalized order-m
polynomial lies. Let's decompose \bm{\beta} = \bm{\xi} + \bm{\theta}
, where \bm{\xi}
(the projection of \bm{\beta}
on this null space) is the coefficients of this order-m
polynomial, and \bm{\theta}
(orthogonal to \bm{\xi}
) is the component that can be shrunk to zero by the penalty. As a result, \bm{\xi} \propto \bm{1}
is not proper, but \bm{\theta} \sim \textrm{N}(\bm{0},\ \bm{S}^-)
is. Function PriorCoef
samples this distribution, and the resulting B-spline coefficients can be used to create random spline curves. The algorithm behind PriorCoef
bypasses the Moore-Penrose generalized inverse and is very efficient. We don't recommend forming this inverse matrix because it, being completely dense, is expensive to compute and store. But if we need it anyway, it can be computed using function MPinv
.
Value
SparseD
returns a list of sparse matrices (of "dgCMatrix" class), giving \bm{D}_{\textrm{gps}}
or \bm{D}_{\textrm{os}}
of order m[1]
, m[2]
, ..., m[length(m)]
. In the latter case, \bm{\bar{S}}
(sparse matrices of "dsCMatrix" or "ddiMatrix" class) and \bm{D}_{\textrm{gps}}
for computing \bm{D}_{\textrm{os}}
are also returned in a "sandwich" attribute.
PriorCoef
returns a list of two components:
-
coef
gives a vector of B-spline coefficients whenn = 1
, or a matrix ofn
columns whenn > 1
, where each column is an independent sample; -
sigma
is a vector, giving the marginal standard deviation for each B-spline coefficient.
MPinv
returns the dense Moore-Penrose generalized inverse matrix \bm{(D'D})^-
if only.diag = FALSE
, and the diagonal entries of this matrix if only.diag = TRUE
.
Author(s)
Zheyuan Li zheyuan.li@bath.edu
References
Zheyuan Li and Jiguo Cao (2022). General P-splines for non-uniform splines, doi:10.48550/arXiv.2201.06808
Examples
require(Matrix)
require(gps)
## 11 domain knots at equal quantiles of Beta(3, 3) distribution
xd <- qbeta(seq.int(0, 1, by = 0.1), 3, 3)
## full knots (with clamped boundary knots) for constructing cubic B-splines
xt <- c(0, 0, 0, xd, 1, 1, 1)
## compute D matrices of order 1 to 3 for O-splines
D.os <- SparseD(xt, d = 4, gps = FALSE)
D1.os <- D.os[[1]]; D2.os <- D.os[[2]]; D3.os <- D.os[[3]]
## get D matrices of order 1 to 3 for general P-splines
## we can of course compute them with D.gps <- SparseD(xt, d = 4, gps = TRUE)
## but they are readily stored in the "sandwich" attribute of 'D.os'
D.gps <- attr(D.os, "sandwich")$D
D1.gps <- D.gps[[1]]; D2.gps <- D.gps[[2]]; D3.gps <- D.gps[[3]]
## we can compute the penalty matrix S = D'D
S.gps <- lapply(D.gps, crossprod)
S1.gps <- S.gps[[1]]; S2.gps <- S.gps[[2]]; S3.gps <- S.gps[[3]]
S.os <- lapply(D.os, crossprod)
S1.os <- S.os[[1]]; S2.os <- S.os[[2]]; S3.os <- S.os[[3]]
## if we want to verify the sandwich formula for O-splines
## extract 'Sbar' matrices stored in the "sandwich" attribute
## and compute the relative error between S and t(D) %*% Sbar %*% D
Sbar <- attr(D.os, "sandwich")$Sbar
Sbar1 <- Sbar[[1]]; Sbar2 <- Sbar[[2]]; Sbar3 <- Sbar[[3]]
range(S1.os - t(D1.gps) %*% Sbar1 %*% D1.gps) / max(abs(S1.os))
range(S2.os - t(D2.gps) %*% Sbar2 %*% D2.gps) / max(abs(S2.os))
range(S3.os - t(D3.gps) %*% Sbar3 %*% D3.gps) / max(abs(S3.os))
## sample B-spline coefficients from their prior distribution
b.gps <- PriorCoef(n = 5, D2.gps)$coef
b.os <- PriorCoef(n = 5, D2.os)$coef
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5), oma = c(0, 0, 1, 0))
## prior B-spline coefficients with a general difference penalty
matplot(b.gps, type = "l", lty = 1, ann = FALSE)
title("general difference penalty")
## prior B-spline coefficients with a derivative penalty
matplot(b.os, type = "l", lty = 1, ann = FALSE)
title("derivative penalty")
title("random B-spline coefficients from their prior", outer = TRUE)
par(op)
## plot the corresponding cubic splines with these B-spline coefficients
x <- MakeGrid(xd, n = 11)
B <- splines::splineDesign(xt, x, ord = 4, sparse = TRUE)
y.gps <- B %*% b.gps
y.os <- B %*% b.os
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5), oma = c(0, 0, 1, 0))
matplot(x, y.gps, type = "l", lty = 1, ann = FALSE)
title("general difference penalty")
matplot(x, y.os, type = "l", lty = 1, ann = FALSE)
title("derivative penalty")
title("random cubic splines with prior B-spline coefficients", outer = TRUE)
par(op)