ordspline {bigsplines}

R Documentation

Fits Ordinal Smoothing Spline

Description

Given a real-valued response vector \mathbf{y}=\{y_{i}\}_{n\times1} and an ordinal predictor vector \mathbf{x}=\{x_{i}\}_{n\times 1} with x_{i} \in \{1,\ldots,K\} \ \forall i, an ordinal smoothing spline model has the form

y_{i}=\eta(x_{i})+e_{i}

where y_{i} is the i-th observation's response, x_{i} is the i-th observation's predictor, \eta is an unknown function relating the response and predictor, and e_{i}\sim\mathrm{N}(0,\sigma^{2}) is iid Gaussian error.

Usage

ordspline(x, y, knots, weights, lambda, monotone=FALSE)

Arguments

`x`	Predictor vector.
`y`	Response vector. Must be same length as `x`.
`knots`	Either a scalar giving the number of equidistant knots to use, or a vector of values to use as the spline knots. If left blank, the number of knots is `min(50, nu)` where `nu = length(unique(x))`.
`weights`	Weights vector (for weighted penalized least squares). Must be same length as `x` and contain non-negative values.
`lambda`	Smoothing parameter. If left blank, `lambda` is tuned via Generalized Cross-Validation.
`monotone`	If `TRUE`, the relationship between `x` and `y` is constrained to be monotonically increasing.

Details

To estimate \eta, the function minimizes the penalized least-squares functional

\frac{1}{n}\sum_{i=1}^{n}(y_{i}-\eta(x_{i}))^{2}+\lambda \sum_{x=2}^{K} [\eta(x)-\eta(x-1)]^{2}

where \lambda\geq0 is a smoothing parameter that controls the trade-off between fitting and smoothing the data.
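
As a rough illustration of the functional above, the following base-R sketch minimizes it directly by treating the K values \eta(1), ..., \eta(K) as free parameters. This is not a description of how `ordspline` is implemented internally; the names `fit_direct`, `X`, and `D` and the simulated data are assumptions made for the example. It only shows how \lambda trades fit against smoothness.

```r
# Illustrative sketch (not the bigsplines implementation):
# minimize (1/n) * sum((y - eta[x])^2) + lambda * sum(diff(eta)^2)
# over eta = (eta(1), ..., eta(K)), assuming x takes values in 1..K.
fit_direct <- function(x, y, K, lambda) {
  n <- length(y)
  X <- outer(x, seq_len(K), "==") * 1   # n x K indicator matrix: X[i, k] = 1 if x[i] == k
  D <- diff(diag(K))                    # (K-1) x K first-difference matrix
  # Normal equations: (X'X + n * lambda * D'D) eta = X'y
  eta <- solve(crossprod(X) + n * lambda * crossprod(D), crossprod(X, y))
  drop(eta)
}

# Simulated data (hypothetical): ordinal predictor with K = 5 levels
set.seed(1)
K <- 5
x <- sample(seq_len(K), 200, replace = TRUE)
y <- sqrt(x) + rnorm(200, sd = 0.3)

eta_rough  <- fit_direct(x, y, K, lambda = 0)    # no penalty: the per-level means of y
eta_smooth <- fit_direct(x, y, K, lambda = 100)  # heavy penalty: shrunk toward a constant
```

With \lambda = 0 the estimate interpolates the level means; as \lambda grows, adjacent levels are pulled together.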

Default use of the function estimates \lambda by minimizing the GCV score:

\mbox{GCV}(\lambda) = \frac{n\|(\mathbf{I}_{n}-\mathbf{S}_{\lambda})\mathbf{y}\|^{2}}{[n-\mathrm{tr}(\mathbf{S}_{\lambda})]^2}

where \mathbf{I}_{n} is the identity matrix and \mathbf{S}_{\lambda} is the smoothing matrix.
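
To make the GCV criterion concrete, here is a minimal base-R sketch that evaluates it for a simple linear smoother built from indicator columns and a first-difference penalty. The helper `gcv_score`, the grid `lambdas`, and the simulated data are assumptions for illustration; the search strategy actually used by `ordspline` may differ.

```r
# Illustrative sketch of GCV for a linear smoother S_lambda (not the bigsplines code).
gcv_score <- function(x, y, K, lambda) {
  n <- length(y)
  X <- outer(x, seq_len(K), "==") * 1   # n x K indicator matrix
  D <- diff(diag(K))                    # (K-1) x K first-difference matrix
  # Smoothing matrix: fitted values are S %*% y
  S <- X %*% solve(crossprod(X) + n * lambda * crossprod(D), t(X))
  res <- y - drop(S %*% y)
  # GCV(lambda) = n * ||(I - S) y||^2 / (n - tr(S))^2
  n * sum(res^2) / (n - sum(diag(S)))^2
}

# Simulated data (hypothetical): ordinal predictor with K = 8 levels
set.seed(1)
K <- 8
x <- sample(seq_len(K), 300, replace = TRUE)
y <- log(x) + rnorm(300, sd = 0.5)

lambdas <- 10^seq(-6, 2, length.out = 50)   # hypothetical search grid
scores  <- sapply(lambdas, function(l) gcv_score(x, y, K, l))
lambda_hat <- lambdas[which.min(scores)]    # grid value with the smallest GCV score
```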

Value

`fitted.values`	Vector of fitted values.
`se.fit`	Vector of standard errors of `fitted.values`.
`sigma`	Estimated error standard deviation, i.e., \hat{\sigma}.
`lambda`	Chosen smoothing parameter.
`info`	Model fit information: vector containing the GCV, R-squared, AIC, and BIC of the fitted model (assuming Gaussian error).
`coef`	Spline basis function coefficients.
`coef.csqrt`	Matrix square-root of the covariance matrix of `coef`. Use `tcrossprod(coef.csqrt)` to get the covariance matrix of `coef`.
`n`	Number of data points, i.e., `length(x)`.
`df`	Effective degrees of freedom (trace of smoothing matrix).
`xunique`	Unique elements of `x`.
`x`	Predictor vector (same as input).
`y`	Response vector (same as input).
`residuals`	Residual vector, i.e., `y - fitted.values`.
`knots`	Spline knots used for fit.
`monotone`	Logical (same as input).
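
The returned components can be combined as in the following sketch, which uses only the fields listed above. It assumes the bigsplines package is installed (the call is guarded for that reason); the simulated data and the object names `lower`, `upper`, and `coef_cov` are made up for illustration. The interval relies on the Gaussian error assumption.

```r
# Sketch: runs only if the bigsplines package is available.
if (requireNamespace("bigsplines", quietly = TRUE)) {
  # Simulated data (hypothetical)
  set.seed(1)
  x <- seq(-3, 3, length.out = 100)
  y <- sin(x) + rnorm(100, sd = 0.5)
  fit <- bigsplines::ordspline(x, y, knots = 50)

  # Approximate 95% pointwise interval around the fitted values
  lower <- fit$fitted.values - qnorm(0.975) * fit$se.fit
  upper <- fit$fitted.values + qnorm(0.975) * fit$se.fit

  # Covariance matrix of the basis coefficients, as described for coef.csqrt
  coef_cov <- tcrossprod(fit$coef.csqrt)

  print(fit$info)   # GCV, R-squared, AIC, BIC
  print(fit$df)     # effective degrees of freedom
}
```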

Warnings

When inputting user-specified knots, all values in `knots` must match a corresponding value in `x`.
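
One way to satisfy this requirement is to draw the knots from the observed values of `x`, as sketched below. The data and the choice of every second unique value are assumptions for illustration, and the final call is guarded because it assumes the bigsplines package is installed.

```r
# Simulated data (hypothetical): ordinal predictor with 10 levels
set.seed(1)
x <- sample(1:10, 200, replace = TRUE)
y <- sqrt(x) + rnorm(200, sd = 0.3)

# Take knots from the observed values so every knot matches a value in x
ux <- sort(unique(x))
myknots <- ux[seq(1, length(ux), by = 2)]   # every second unique value
stopifnot(all(myknots %in% x))              # check before fitting

# Fit with the user-specified knots (runs only if bigsplines is available)
if (requireNamespace("bigsplines", quietly = TRUE)) {
  fit <- bigsplines::ordspline(x, y, knots = myknots)
}
```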

Note

The spline is estimated using penalized least-squares, which does not require the Gaussian error assumption. However, the spline inference information (e.g., standard errors and fit information) requires the Gaussian error assumption.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer.

Helwig, N. E. (2013). Fast and stable smoothing spline analysis of variance models for large samples with applications to electroencephalography data analysis. Unpublished doctoral dissertation. University of Illinois at Urbana-Champaign.

Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13.

Helwig, N. E. and Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24, 715-732.

Examples


##########   EXAMPLE   ##########

# generate some data
n <- 100
nk <- 50
x <- seq(-3,3,length.out=n)
eta <- (sin(2*x/pi) + 0.25*x^3 + 0.05*x^5)/15
set.seed(1)
y <- eta + rnorm(n, sd=0.5)

# plot data and true eta
plot(x, y)
lines(x, eta, col="blue", lwd=2)

# fit ordinal smoothing spline
ossmod <- ordspline(x, y, knots=nk)
lines(ossmod$x, ossmod$fitted.values, col="red", lwd=2)

# fit monotonic smoothing spline
mssmod <- ordspline(x, y, knots=nk, monotone=TRUE)
lines(mssmod$x, mssmod$fitted.values, col="purple", lwd=2)

[Package bigsplines version 1.1-1 Index]