ordspline {bigsplines} | R Documentation |
Fits Ordinal Smoothing Spline
Description
Given a real-valued response vector $\mathbf{y}=\{y_{i}\}_{n\times1}$ and an ordinal predictor vector $\mathbf{x}=\{x_{i}\}_{n\times 1}$ with $x_{i} \in \{1,\ldots,K\} \ \forall i$, an ordinal smoothing spline model has the form
$$y_{i}=\eta(x_{i})+e_{i}$$
where $y_{i}$ is the $i$-th observation's response, $x_{i}$ is the $i$-th observation's predictor, $\eta$ is an unknown function relating the response and predictor, and $e_{i}\sim\mathrm{N}(0,\sigma^{2})$ is iid Gaussian error.
Usage
ordspline(x, y, knots, weights, lambda, monotone=FALSE)
Arguments
x |
Predictor vector. |
y |
Response vector. Must be same length as x. |
knots |
Either a scalar giving the number of equidistant knots to use, or a vector of values to use as the spline knots. If left blank, the number of knots is |
weights |
Weights vector (for weighted penalized least squares). Must be same length as x. |
lambda |
Smoothing parameter. If left blank, lambda is chosen by minimizing the GCV score (see Details). |
monotone |
If TRUE, the fitted function is constrained to be monotonic. |
Details
To estimate $\eta$, the function minimizes the penalized least-squares functional
$$\frac{1}{n}\sum_{i=1}^{n}(y_{i}-\eta(x_{i}))^{2}+\lambda \sum_{x=2}^{K} [\eta(x)-\eta(x-1)]^{2}$$
where $\lambda\geq0$ is a smoothing parameter that controls the trade-off between fitting and smoothing the data.
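The criterion above can be written in matrix form and solved directly when the predictor takes only K distinct levels. The following is a minimal base-R sketch under that assumption; it is not the algorithm used inside ordspline, and the names Z, D, and fit_eta are hypothetical helpers introduced only for illustration. It shows that lambda = 0 reproduces the level means, while a large lambda shrinks the differences between adjacent levels.

```r
# Base-R sketch of the penalized least-squares criterion.
# Assumption: this is an illustration, not the bigsplines implementation.
set.seed(1)
K <- 6                                   # number of ordinal levels
n <- 120                                 # number of observations
x <- sample(1:K, n, replace = TRUE)      # ordinal predictor in {1, ..., K}
x[1:K] <- 1:K                            # make sure every level is observed
eta_true <- c(0, 0.5, 0.9, 1.2, 1.4, 1.5)
y <- eta_true[x] + rnorm(n, sd = 0.3)

# n x K indicator matrix: Z[i, k] = 1 if x[i] == k, else 0
Z <- outer(x, 1:K, "==") * 1
# (K - 1) x K first-difference matrix: (D %*% eta)[k] = eta[k + 1] - eta[k]
D <- diff(diag(K))

# Hypothetical helper: minimizer of (1/n) * ||y - Z eta||^2 + lambda * ||D eta||^2
fit_eta <- function(lambda) {
  A <- crossprod(Z) / n + lambda * crossprod(D)
  drop(solve(A, crossprod(Z, y) / n))
}

eta_rough  <- fit_eta(0)    # no penalty: one mean per level
eta_smooth <- fit_eta(10)   # heavy penalty: adjacent levels pulled together

print(round(rbind(eta_rough, eta_smooth), 3))
```

As lambda grows without bound, the penalty forces all adjacent differences toward zero, so the estimate approaches a constant equal to the overall mean of y.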
Default use of the function estimates $\lambda$ by minimizing the GCV score:
$$\mbox{GCV}(\lambda) = \frac{n\|(\mathbf{I}_{n}-\mathbf{S}_{\lambda})\mathbf{y}\|^{2}}{[n-\mathrm{tr}(\mathbf{S}_{\lambda})]^2}$$
where $\mathbf{I}_{n}$ is the identity matrix and $\mathbf{S}_{\lambda}$ is the smoothing matrix.
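To make the GCV score concrete, here is a small base-R sketch that evaluates it over a grid of smoothing parameters. It assumes the smoothing matrix implied by a first-difference penalty on K level values, S = Z (Z'Z + n lambda D'D)^{-1} Z'; the grid and the helpers smoother and gcv are hypothetical, and the search that ordspline performs internally may differ.

```r
# Base-R sketch of GCV-based selection of lambda.
# Assumption: this is an illustration, not the bigsplines implementation.
set.seed(1)
K <- 6
n <- 120
x <- sample(1:K, n, replace = TRUE)
x[1:K] <- 1:K                            # make sure every level is observed
y <- c(0, 0.5, 0.9, 1.2, 1.4, 1.5)[x] + rnorm(n, sd = 0.3)

Z <- outer(x, 1:K, "==") * 1             # n x K indicator matrix
D <- diff(diag(K))                       # (K - 1) x K first-difference matrix

# Hypothetical helper: smoothing matrix S_lambda, which maps y to fitted values
smoother <- function(lambda) {
  Z %*% solve(crossprod(Z) + n * lambda * crossprod(D), t(Z))
}

# Hypothetical helper: GCV(lambda) = n * ||(I - S) y||^2 / (n - tr(S))^2
gcv <- function(lambda) {
  S <- smoother(lambda)
  res <- y - drop(S %*% y)
  n * sum(res^2) / (n - sum(diag(S)))^2
}

lambdas <- 10^seq(-6, 2, length.out = 50)   # assumed search grid
scores <- vapply(lambdas, gcv, numeric(1))
lambda_hat <- lambdas[which.min(scores)]

cat("selected lambda:", lambda_hat, "\n")
cat("effective df:", sum(diag(smoother(lambda_hat))), "\n")
```

The trace of the smoothing matrix is the effective degrees of freedom: it equals K when lambda is zero and falls toward 1 as lambda increases.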
Value
fitted.values |
Vector of fitted values. |
se.fit |
Vector of standard errors of fitted.values. |
sigma |
Estimated error standard deviation, i.e., $\hat{\sigma}$. |
lambda |
Chosen smoothing parameter. |
info |
Model fit information: vector containing the GCV, R-squared, AIC, and BIC of the fitted model (assuming Gaussian error). |
coef |
Spline basis function coefficients. |
coef.csqrt |
Matrix square-root of covariance matrix of coef. |
n |
Number of data points, i.e., length(x). |
df |
Effective degrees of freedom (trace of smoothing matrix). |
xunique |
Unique elements of x. |
x |
Predictor vector (same as input). |
y |
Response vector (same as input). |
residuals |
Residual vector, i.e., y - fitted.values. |
knots |
Spline knots used for fit. |
monotone |
Logical (same as input). |
Warnings
When inputting user-specified knots, all values in knots must match a corresponding value in x.
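One way to guard against this is to validate the knots before fitting. The sketch below uses only base R; check_knots is a hypothetical helper written for illustration and is not part of bigsplines.

```r
# Base-R sketch: verify that every user-specified knot occurs in x.
# check_knots() is a hypothetical helper, not a bigsplines function.
check_knots <- function(knots, x) {
  unmatched <- setdiff(knots, x)
  if (length(unmatched) > 0) {
    stop("knots not found in x: ", paste(unmatched, collapse = ", "))
  }
  invisible(TRUE)
}

x <- c(1, 2, 2, 3, 5, 5, 6)
knots_ok  <- c(1, 3, 6)   # every value appears in x
knots_bad <- c(1, 4, 6)   # 4 does not appear in x

check_knots(knots_ok, x)  # passes silently

# The invalid set is caught and reported instead of being passed on to the fit
msg <- tryCatch(check_knots(knots_bad, x), error = function(e) conditionMessage(e))
print(msg)
```

Running the check before the fit keeps an invalid knot vector from reaching the call at all.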
Note
The spline is estimated using penalized least-squares, which does not require the Gaussian error assumption. However, the spline inference information (e.g., standard errors and fit information) requires the Gaussian error assumption.
Author(s)
Nathaniel E. Helwig <helwig@umn.edu>
References
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer.
Helwig, N. E. (2013). Fast and stable smoothing spline analysis of variance models for large samples with applications to electroencephalography data analysis. Unpublished doctoral dissertation. University of Illinois at Urbana-Champaign.
Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13.
Helwig, N. E. and Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24, 715-732.
Examples
########## EXAMPLE ##########
# load the package and generate some data
library(bigsplines)
n <- 100
nk <- 50
x <- seq(-3, 3, length.out = n)
eta <- (sin(2*x/pi) + 0.25*x^3 + 0.05*x^5)/15
set.seed(1)
y <- eta + rnorm(n, sd = 0.5)
# plot data and true eta
plot(x, y)
lines(x, eta, col = "blue", lwd = 2)
# fit ordinal smoothing spline (x is already sorted, so lines() draws left to right)
ossmod <- ordspline(x, y, knots = nk)
lines(ossmod$x, ossmod$fitted.values, col = "red", lwd = 2)
# fit monotonic smoothing spline
mssmod <- ordspline(x, y, knots = nk, monotone = TRUE)
lines(mssmod$x, mssmod$fitted.values, col = "purple", lwd = 2)