ordinal {npreg} | R Documentation |
Ordinal Smoothing Spline Basis and Penalty
Description
Generate the smoothing spline basis and penalty matrix for an ordinal spline. This basis and penalty are for an ordered factor.
Usage
basis.ord(x, knots, K = NULL, intercept = FALSE, ridge = FALSE)
penalty.ord(x, K = NULL, xlev = NULL)
Arguments
x |
Predictor variable (basis) or spline knots (penalty). Ordered factor or integer vector of length |
knots |
Spline knots. Ordered factor or integer vector of length |
K |
Number of levels of |
xlev |
Factor levels of |
intercept |
If |
ridge |
If |
Details
Generates a basis function or penalty matrix used to fit ordinal smoothing splines.
With an intercept included, the basis function matrix has the form
X = [X_0, X_1]
where matrix X_0
is an n
by 1 matrix of ones, and X_1
is a matrix of dimension n
by r
. The X_0
matrix contains the "parametric part" of the basis (i.e., the intercept). The matrix X_1
contains the "nonparametric part" of the basis, which consists of the reproducing kernel function
\rho(x, y) = 1 - (x \vee y) + (1 / 2K) * ( x(x-1) + y(y-1) ) + c
evaluated at all combinations of x
and knots
. The notation (x \vee y)
denotes the maximum of x
and y
, and the constant is c = (K-1)(2K-1) / (6K)
.
The penalty matrix consists of the reproducing kernel function
\rho(x, y) = 1 - (x \vee y) + (1 / 2K) * ( x(x-1) + y(y-1) ) + c
evaluated at all combinations of x
.
Value
Basis: Matrix of dimension c(length(x), df)
where df = length(knots) + intercept
.
Penalty: Matrix of dimension c(r, r)
where r = length(x)
is the number of knots.
Note
If the inputs x
and knots
are factors, they should have the same levels.
If the inputs x
and knots
are integers, the knots
should be a subset of the input x
.
If ridge = TRUE
, the penalty matrix is the identity matrix.
Author(s)
Nathaniel E. Helwig <helwig@umn.edu>
References
Gu, C. (2013). Smoothing Spline ANOVA Models. 2nd Ed. New York, NY: Springer-Verlag. doi:10.1007/978-1-4614-5369-7
Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13. doi:10.3389/fams.2017.00015
Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi:10.4135/9781526421036885885
See Also
See nominal
for a basis and penalty for unordered factors.
See polynomial
for a basis and penalty for numeric variables.
Examples
######***###### standard parameterization ######***######
# generate data
set.seed(0)
n <- 101
x <- factor(sort(rep(LETTERS[1:4], length.out = n)))
knots <- LETTERS[1:3]
eta <- 1:4
y <- eta[x] + rnorm(n, sd = 0.5)
# ordinal smoothing spline basis
X <- basis.ord(x, knots, intercept = TRUE)
# ordinal smoothing spline penalty
Q <- penalty.ord(knots, K = 4)
# pad Q with zeros (for intercept)
Q <- rbind(0, cbind(0, Q))
# define smoothing parameter
lambda <- 1e-5
# estimate coefficients
coefs <- solve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y)
# estimate eta
yhat <- X %*% coefs
# check rmse
sqrt(mean((eta[x] - yhat)^2))
######***###### ridge parameterization ######***######
# generate data
set.seed(0)
n <- 101
x <- factor(sort(rep(LETTERS[1:4], length.out = n)))
knots <- LETTERS[1:3]
eta <- 1:4
y <- eta[x] + rnorm(n, sd = 0.5)
# ordinal smoothing spline basis
X <- basis.ord(x, knots, intercept = TRUE, ridge = TRUE)
# ordinal smoothing spline penalty (ridge)
Q <- diag(rep(c(0, 1), times = c(1, ncol(X) - 1)))
# define smoothing parameter
lambda <- 1e-5
# estimate coefficients
coefs <- solve(crossprod(X) + n * lambda * Q) %*% crossprod(X, y)
# estimate eta
yhat <- X %*% coefs
# check rmse
sqrt(mean((eta[x] - yhat)^2))