penalty {gps}R Documentation

Wiggliness penalties for penalized B-splines

Description

For penalized B-splines (including standard or general P-splines and O-splines), (1) construct matrix D\bm{D} in the wiggliness penalty Dβ2\|\bm{D\beta}\|^2; (2) sample B-spline coefficients from their prior distribution N(0, (DD))\textrm{N}(\bm{0},\ (\bm{D'D})^-); (3) compute the Moore-Penrose generalized inverse matrix (DD)(\bm{D'D})^-.

Usage

SparseD(xt, d, m = NULL, gps = TRUE)

PriorCoef(n, D)

MPinv(D, only.diag = FALSE)

Arguments

xt

full knot sequence for ordinary B-splines (length(xt) >= 2 * d).

d

B-spline order (d2d \ge 2).

m

penalty order (1md11 \le m \le d - 1). Can be a vector of multiple values for SparseD.

gps

if TRUE, return Dgps\bm{D}_{\textrm{gps}}; if FALSE, return Dos\bm{D}_{\textrm{os}}.

n

number of samples to draw from the prior distribution.

D

matrix Dgps\bm{D}_{\textrm{gps}} or Dos\bm{D}_{\textrm{os}}.

only.diag

if TURE, only diagonal elements are computed.

Details

General Difference Penalty for General P-Splines

A general P-spline is characterized by an order-mm general difference matrix Dgps\bm{D}_{\textrm{gps}}, which can be computed by SparseD(..., gps = TRUE). For interpretation, the differenced coefficients Dgpsβ\bm{D}_{\textrm{gps}}\bm{\beta} are in fact f(m)(x)f^{(m)}(x)'s B-spline coefficients, so the penalty is their squared L2L_2 norm.

Derivative Penalty for O-Splines

An O-spline is characterized by Dos\bm{D}_{\textrm{os}} such that Dosβ2=abf(m)(x)2dx\|\bm{D}_{\textrm{os}}\bm{\beta}\|^2 = \int_a^b f^{(m)}(x)^2\textrm{d}x. Since f(m)(x)f^{(m)}(x) has B-spline coefficients Dgpsβ\bm{D}_{\textrm{gps}}\bm{\beta}, the integral can be shown to be βDgpsSˉDgpsβ\bm{\beta'}\bm{D}_{\textrm{gps}}'\bm{\bar{S}}\bm{D}_{\textrm{gps}}\bm{\beta}, where Sˉ\bm{\bar{S}} is the Gram matrix of those B-splines representing f(m)(x)f^{(m)}(x). Following the Cholesky factorization Sˉ=UU\bm{\bar{S}} = \bm{U'U}, the quadratic form becomes UDgpsβ2\|\bm{U}\bm{D}_{\textrm{gps}}\bm{\beta}\|^2, so that Dos=UDgps\bm{D}_{\textrm{os}} = \bm{U}\bm{D}_{\textrm{gps}}. This matrix can be computed by SparseD(..., gps = FALSE), with Sˉ\bm{\bar{S}} and Dgps\bm{D}_{\textrm{gps}} also returned in a "sandwich" attribute.

Penalty Matrix

We can express the L2L_2 penalty Dβ2\|\bm{D\beta}\|^2 as quadratic form βSβ\bm{\beta'S\beta}, where S=DD\bm{S} = \bm{D'D} is called a penalty matrix. It is trivial to compute S\bm{S} (using function crossprod) once D\bm{D} is available, so we don't feel the need to provide a function for this. Note that the link between Dos\bm{D}_{\textrm{os}} and Dgps\bm{D}_{\textrm{gps}} implies a sandwich formula Sos=DgpsSˉDgps\bm{S}_{\textrm{os}} = \bm{D}_{\textrm{gps}}'\bm{\bar{S}}\bm{D}_{\textrm{gps}}, wherease Sgps=DgpsDgps\bm{S}_{\textrm{gps}} = \bm{D}_{\textrm{gps}}'\bm{D}_{\textrm{gps}}.

The Bayesian View

In the Bayesian view, the penalty βSβ\bm{\beta'S\beta} is a Gaussian prior for B-spline coefficients β\bm{\beta}. But it is an improper one because S\bm{S} has a null space where an unpenalized order-mm polynomial lies. Let's decompose β=ξ+θ\bm{\beta} = \bm{\xi} + \bm{\theta}, where ξ\bm{\xi} (the projection of β\bm{\beta} on this null space) is the coefficients of this order-mm polynomial, and θ\bm{\theta} (orthogonal to ξ\bm{\xi}) is the component that can be shrunk to zero by the penalty. As a result, ξ1\bm{\xi} \propto \bm{1} is not proper, but θN(0, S)\bm{\theta} \sim \textrm{N}(\bm{0},\ \bm{S}^-) is. Function PriorCoef samples this distribution, and the resulting B-spline coefficients can be used to create random spline curves. The algorithm behind PriorCoef bypasses the Moore-Penrose generalized inverse and is very efficient. We don't recommend forming this inverse matrix because it, being completely dense, is expensive to compute and store. But if we need it anyway, it can be computed using function MPinv.

Value

SparseD returns a list of sparse matrices (of "dgCMatrix" class), giving Dgps\bm{D}_{\textrm{gps}} or Dos\bm{D}_{\textrm{os}} of order m[1], m[2], ..., m[length(m)]. In the latter case, Sˉ\bm{\bar{S}} (sparse matrices of "dsCMatrix" or "ddiMatrix" class) and Dgps\bm{D}_{\textrm{gps}} for computing Dos\bm{D}_{\textrm{os}} are also returned in a "sandwich" attribute.

PriorCoef returns a list of two components:

MPinv returns the dense Moore-Penrose generalized inverse matrix (DD)\bm{(D'D})^- if only.diag = FALSE, and the diagonal entries of this matrix if only.diag = TRUE.

Author(s)

Zheyuan Li zheyuan.li@bath.edu

References

Zheyuan Li and Jiguo Cao (2022). General P-splines for non-uniform splines, doi:10.48550/arXiv.2201.06808

Examples

require(Matrix)
require(gps)

## 11 domain knots at equal quantiles of Beta(3, 3) distribution
xd <- qbeta(seq.int(0, 1, by = 0.1), 3, 3)
## full knots (with clamped boundary knots) for constructing cubic B-splines
xt <- c(0, 0, 0, xd, 1, 1, 1)

## compute D matrices of order 1 to 3 for O-splines
D.os <- SparseD(xt, d = 4, gps = FALSE)
D1.os <- D.os[[1]]; D2.os <- D.os[[2]]; D3.os <- D.os[[3]]

## get D matrices of order 1 to 3 for general P-splines
## we can of course compute them with D.gps <- SparseD(xt, d = 4, gps = TRUE)
## but they are readily stored in the "sandwich" attribute of 'D.os'
D.gps <- attr(D.os, "sandwich")$D
D1.gps <- D.gps[[1]]; D2.gps <- D.gps[[2]]; D3.gps <- D.gps[[3]]

## we can compute the penalty matrix S = D'D
S.gps <- lapply(D.gps, crossprod)
S1.gps <- S.gps[[1]]; S2.gps <- S.gps[[2]]; S3.gps <- S.gps[[3]]
S.os <- lapply(D.os, crossprod)
S1.os <- S.os[[1]]; S2.os <- S.os[[2]]; S3.os <- S.os[[3]]

## if we want to verify the sandwich formula for O-splines
## extract 'Sbar' matrices stored in the "sandwich" attribute
## and compute the relative error between S and t(D) %*% Sbar %*% D
Sbar <- attr(D.os, "sandwich")$Sbar
Sbar1 <- Sbar[[1]]; Sbar2 <- Sbar[[2]]; Sbar3 <- Sbar[[3]]
range(S1.os - t(D1.gps) %*% Sbar1 %*% D1.gps) / max(abs(S1.os))
range(S2.os - t(D2.gps) %*% Sbar2 %*% D2.gps) / max(abs(S2.os))
range(S3.os - t(D3.gps) %*% Sbar3 %*% D3.gps) / max(abs(S3.os))

## sample B-spline coefficients from their prior distribution
b.gps <- PriorCoef(n = 5, D2.gps)$coef
b.os <- PriorCoef(n = 5, D2.os)$coef
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5), oma = c(0, 0, 1, 0))
## prior B-spline coefficients with a general difference penalty
matplot(b.gps, type = "l", lty = 1, ann = FALSE)
title("general difference penalty")
## prior B-spline coefficients with a derivative penalty
matplot(b.os, type = "l", lty = 1, ann = FALSE)
title("derivative penalty")
title("random B-spline coefficients from their prior", outer = TRUE)
par(op)

## plot the corresponding cubic splines with these B-spline coefficients
x <- MakeGrid(xd, n = 11)
B <- splines::splineDesign(xt, x, ord = 4, sparse = TRUE)
y.gps <- B %*% b.gps
y.os <- B %*% b.os
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5), oma = c(0, 0, 1, 0))
matplot(x, y.gps, type = "l", lty = 1, ann = FALSE)
title("general difference penalty")
matplot(x, y.os, type = "l", lty = 1, ann = FALSE)
title("derivative penalty")
title("random cubic splines with prior B-spline coefficients", outer = TRUE)
par(op)

[Package gps version 1.2 Index]