gps2GS {gps} | R Documentation |
Penalized B-splines estimation with automatic grid search of their smoothing parameter
Description
Fit penalized B-splines (including standard or general P-splines and O-splines) to (x, y, w)
for a grid of smoothing parameter values in the automatic search intervals of Li and Cao (2023). The GCV score and effective degree of freedom of each fit are also returned.
Usage
gps2GS(x, y, w = NULL, xt, d = 4, m = 2, gps = TRUE, periodic = FALSE,
ng = 20, scalePen = TRUE)
DemoRhoLim(fit, plot = TRUE)
Arguments
x , y , w |
a vector of |
xt |
full knot sequence for ordinary B-splines ( |
d |
B-spline order ( |
m |
penalty order ( |
gps |
if TRUE, use a difference penalty; if FALSE, use a derivative penalty. |
periodic |
if TRUE, periodic boundary conditions are applied to B-splines and their penalty, so that periodic P-splines are estimated. |
ng |
number of grid points in the grid search of |
scalePen |
if TRUE, scale the penalty matrix |
fit |
fitted P-splines returned by |
plot |
if TRUE, produce summary plots. |
Details
We smooth y_i
using f(x_i) = \bm{B_i\beta}
, where \bm{B_i}
is i
-th row of the B-spline design matrix \bm{B}
and \bm{\beta}
is a vector of B-spline coefficients. These coefficients are estimated by minimizing:
\|\bm{y} - \bm{B\beta}\|^2 + \exp(\rho)\cdot\|\bm{D\beta}\|^2,
where the L_2
penalty \|\bm{D\beta}\|^2
is some wiggliness measure for f(x)
and \rho \in (-\infty, +\infty)
is a smoothing parameter.
Value
gps2GS
returns a large list with the following components:
-
eqn
-
eigen
-
rho.lim
-
E
-
pwls
Author(s)
Zheyuan Li zheyuan.li@bath.edu
References
Zheyuan Li and Jiguo Cao (2023). Automatic search intervals for the smoothing parameter in penalized splines, Statistics and Computing, doi:10.1007/s11222-022-10178-z
Examples
require(gps)
x <- rnorm(100)
xt <- PlaceKnots(x, d = 4, k = 10)
## set ng = 0 to set up grid search only
## here the y-values does not matter; we simply use the x-values
setup <- gps2GS(x, x, xt = xt, d = 4, m = 2, ng = 0)
## compute exact eigenvalues
DemoResult <- DemoRhoLim(setup)
## simulate 100 (x, y) data from g(x) = sin(2 * pi * x) on [0, 1]
## x-values are not equidistant but at quantiles of Beta(2, 2)
## note that g(x) is a periodic function
x <- qbeta(seq.int(0, 1, length.out = 100), 2, 2)
gx <- sin(2 * pi * x)
y <- rnorm(length(x), gx, sd = 0.1)
## place quantile knots with clamped boundary knots
xt <- PlaceKnots(x, d = 4, k = 10)
## fit a general P-spline with different boundary constraints
ordinary <- gps2GS(x, y, xt = xt, d = 4, m = 2)
periodic <- gps2GS(x, y, xt = xt, d = 4, m = 2, periodic = TRUE)
## identify the optimal fit minimizing GCV score
opt.ordinary <- which.min(ordinary$pwls$gcv)
opt.periodic <- which.min(periodic$pwls$gcv)
## inspect grid search result
## column 1: ordinary cubic spline
## column 2: periodic cubic spline
op <- par(mfcol = c(2, 2), mar = c(2, 2, 1.5, 0.5))
## ordinary spline
with(ordinary$pwls, plot(rho, edf, ann = FALSE))
title("edf v.s. log(lambda)")
with(ordinary$pwls, plot(rho, gcv, ann = FALSE))
with(ordinary$pwls, points(rho[opt.ordinary], gcv[opt.ordinary], pch = 19))
title("GCV v.s. log(lambda)")
## periodic spline
with(periodic$pwls, plot(rho, edf, ann = FALSE))
title("edf v.s. log(lambda)")
with(periodic$pwls, plot(rho, gcv, ann = FALSE))
with(periodic$pwls, points(rho[opt.periodic], gcv[opt.periodic], pch = 19))
title("GCV v.s. log(lambda)")
par(op)
## inspect fitted splines
yhat.ordinary <- with(ordinary, eqn$B %*% pwls$beta)
yhat.periodic <- with(periodic, eqn$B %*% pwls$beta)
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5))
## ordinary spline
matplot(x, yhat.ordinary, type = "l", lty = 1, ann = FALSE)
title("ordinary")
## periodic spline
matplot(x, yhat.periodic, type = "l", lty = 1, ann = FALSE)
title("periodic")
par(op)
## pick and plot the optimal fit minimizing GCV score
best.ordinary <- yhat.ordinary[, opt.ordinary]
best.periodic <- yhat.periodic[, opt.periodic]
op <- par(mfrow = c(1, 2), mar = c(2, 2, 1.5, 0.5))
## ordinary spline
plot(x, y, ann = FALSE)
lines(x, gx, lwd = 2, col = 2)
lines(x, best.ordinary, lwd = 2)
title("ordinary")
## periodic spline
plot(x, y, ann = FALSE)
lines(x, gx, lwd = 2, col = 2)
lines(x, best.periodic, lwd = 2)
title("periodic")
par(op)