R: Thin Plate Spline Basis and Penalty

thinplate {npreg}

R Documentation

Thin Plate Spline Basis and Penalty

Description

Generate the smoothing spline basis and penalty matrix for a thin plate spline.

Usage

basis.tps(x, knots, m = 2, rk = TRUE, intercept = FALSE, ridge = FALSE)

penalty.tps(x, m = 2, rk = TRUE)

Arguments

`x`	Predictor variables (basis) or spline knots (penalty). Numeric or integer vector of length `n`, or matrix of dimension `n` by `p`.
`knots`	Spline knots. Numeric or integer vector of length `r`, or matrix of dimension `r` by `p`.
`m`	Penalty order. "m=1" for linear thin plate spline, "m=2" for cubic, and "m=3" for quintic. Must satisfy `2m > p`.
`rk`	If true (default), the reproducing kernel parameterization is used. Otherwise, the classic thin plate basis is returned.
`intercept`	If `TRUE`, the first column of the basis will be a column of ones.
`ridge`	If `TRUE`, the basis matrix is post-multiplied by the inverse square root of the penalty matrix. Only applicable if `rk = TRUE`. See Note and Examples.

Details

Generates a basis function or penalty matrix used to fit linear, cubic, and quintic thin plate splines.

The basis function matrix has the form

X = [X_0, X_1]

where the matrix X_0 is of dimension n by M-1 (plus 1 if an intercept is included) where M = {p+m-1 \choose p}, and X_1 is a matrix of dimension n by r.

The X_0 matrix contains the "parametric part" of the basis, which includes polynomial functions of the columns of x up to degree m-1 (and potentially interactions).

The matrix X_1 contains the "nonparametric part" of the basis.

If rk = TRUE, the matrix X_1 consists of the reproducing kernel function

\rho(x, y) = (I - P_x) (I - P_y) E(|x - y|)

evaluated at all combinations of x and knots. Note that P_x and P_y are projection operators, |.| denotes the Euclidean distance, and the TPS semi-kernel is defined as

E(z) = \alpha z^{2m-p} \log(z)

if p is even and

E(z) = \beta z^{2m-p}

otherwise, where \alpha and \beta are positive constants (see References).

If rk = FALSE, the matrix X_1 contains the TPS semi-kernel E(.) evaluated at all combinations of x and knots. Note: the TPS semi-kernel is not positive (semi-)definite, but the projection is.

If rk = TRUE, the penalty matrix consists of the reproducing kernel function

\rho(x, y) = (I - P_x) (I - P_y) E(|x - y|)

evaluated at all combinations of x. If rk = FALSE, the penalty matrix contains the TPS semi-kernel E(.) evaluated at all combinations of x.

Value

Basis: Matrix of dimension c(length(x), df) where df = nrow(as.matrix(knots)) + choose(p + m - 1, p) - !intercept and p = ncol(as.matrix(x)).

Penalty: Matrix of dimension c(r, r) where r = nrow(as.matrix(x)) is the number of knots.

Note

The inputs x and knots must have the same dimension.

If rk = TRUE and ridge = TRUE, the penalty matrix is the identity matrix.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Gu, C. (2013). Smoothing Spline ANOVA Models. 2nd Ed. New York, NY: Springer-Verlag. doi:10.1007/978-1-4614-5369-7

Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13. doi:10.3389/fams.2017.00015

Helwig, N. E., & Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24(3), 715-732. doi:10.1080/10618600.2014.926819

Examples

######***######   standard parameterization   ######***######

# generate data
set.seed(0)
n <- 101
x <- seq(0, 1, length.out = n)
knots <- seq(0, 0.95, by = 0.05)
eta <- 1 + 2 * x + sin(2 * pi * x)
y <- eta + rnorm(n, sd = 0.5)

# cubic thin plate spline basis
X <- basis.tps(x, knots, intercept = TRUE)

# cubic thin plate spline penalty
Q <- penalty.tps(knots)

# pad Q with zeros (for intercept and linear effect)
Q <- rbind(0, 0, cbind(0, 0, Q))

# define smoothing parameter
lambda <- 1e-5

# estimate coefficients
coefs <- psolve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y)

# estimate eta
yhat <- X %*% coefs

# check rmse
sqrt(mean((eta - yhat)^2))

# plot results
plot(x, y)
lines(x, yhat)



######***######   ridge parameterization   ######***######

# generate data
set.seed(0)
n <- 101
x <- seq(0, 1, length.out = n)
knots <- seq(0, 0.95, by = 0.05)
eta <- 1 + 2 * x + sin(2 * pi * x)
y <- eta + rnorm(n, sd = 0.5)

# cubic thin plate spline basis
X <- basis.tps(x, knots, intercept = TRUE, ridge = TRUE)

# cubic thin plate spline penalty (ridge)
Q <- diag(rep(c(0, 1), times = c(2, ncol(X) - 2)))

# define smoothing parameter
lambda <- 1e-5

# estimate coefficients
coefs <- psolve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y)

# estimate eta
yhat <- X %*% coefs

# check rmse
sqrt(mean((eta - yhat)^2))

# plot results
plot(x, y)
lines(x, yhat)

[Package npreg version 1.1.0 Index]