bigspline {bigsplines} | R Documentation |
Fits Smoothing Spline
Description
Given a real-valued response vector \mathbf{y}=\{y_{i}\}_{n\times1}
and a real-valued predictor vector \mathbf{x}=\{x_{i}\}_{n\times 1}
with a \leq x_{i} \leq b \ \forall i
, a smoothing spline model has the form
y_{i}=\eta(x_{i})+e_{i}
where y_{i}
is the i
-th observation's respone, x_{i}
is the i
-th observation's predictor, \eta
is an unknown smooth function relating the response and predictor, and e_{i}\sim\mathrm{N}(0,\sigma^{2})
is iid Gaussian error.
Usage
bigspline(x,y,type="cub",nknots=30,rparm=0.01,xmin=min(x),
xmax=max(x),alpha=1,lambdas=NULL,se.fit=FALSE,
rseed=1234,knotcheck=TRUE)
Arguments
x |
Predictor vector. |
y |
Response vector. Must be same length as |
type |
Type of spline for |
nknots |
Scalar giving maximum number of knots to bin-sample. Use more knots for more jagged functions. |
rparm |
Rounding parameter for |
xmin |
Minimum |
xmax |
Maximum |
alpha |
Manual tuning parameter for GCV score. Using |
lambdas |
Vector of global smoothing parameters to try. Default estimates smoothing parameter that minimizes GCV score. |
se.fit |
Logical indicating if the standard errors of fitted values should be estimated. |
rseed |
Random seed. Input to |
knotcheck |
If |
Details
To estimate \eta
I minimize the penalized least-squares functional
\frac{1}{n}\sum_{i=1}^{n}(y_{i}-\eta(x_{i}))^{2}+\lambda \int [\ddot{\eta}(x)]^2 dx
where \ddot{\eta}
denotes the second derivative of \eta
and \lambda\geq0
is a smoothing parameter that controls the trade-off between fitting and smoothing the data.
Default use of the function estimates \lambda
by minimizing the GCV score:
\mbox{GCV}(\lambda) = \frac{n\|(\mathbf{I}_{n}-\mathbf{S}_{\lambda})\mathbf{y}\|^{2}}{[n-\mathrm{tr}(\mathbf{S}_{\lambda})]^2}
where \mathbf{I}_{n}
is the identity matrix and \mathbf{S}_{\lambda}
is the smoothing matrix (see Computational Details).
Using the rounding parameter input rparm
can greatly speed-up and stabilize the fitting for large samples. When rparm
is used, the spline is fit to a set of unique data points after rounding; the unique points are determined using the efficient algorithm described in Helwig (2013). For typical cases, I recommend using rparm=0.01
, but smaller rounding parameters (e,g., rparm=0.001
) may be needed for particularly jagged functions (or when x
has outliers).
Value
fitted.values |
Vector of fitted values corresponding to the original data points in |
se.fit |
Vector of standard errors of |
x |
Predictor vector (same as input). |
y |
Response vector (same as input). |
type |
Type of spline that was used. |
xunique |
Unique elements of |
yunique |
Mean of |
funique |
Vector giving frequency of each element of |
sigma |
Estimated error standard deviation, i.e., |
ndf |
Data frame with two elements: |
info |
Model fit information: vector containing the GCV, multiple R-squared, AIC, and BIC of fit model (assuming Gaussian error). |
xrng |
Predictor range: |
myknots |
Bin-sampled spline knots used for fit. |
rparm |
Rounding parameter for |
lambda |
Optimal smoothing parameter. |
coef |
Spline basis function coefficients. |
coef.csqrt |
Matrix square-root of covariace matrix of |
Warnings
Cubic and cubic periodic splines transform the predictor to the interval [0,1] before fitting. So input xmin
must be less than or equal to min(x)
, and input xmax
must be greater than or equal to max(x)
.
When using rounding parameters, output fitted.values
corresponds to unique rounded predictor scores in output xunique
. Use predict.bigspline
function to get fitted values for full y
vector.
Computational Details
According to smoothing spline theory, the function \eta
can be approximated as
\eta(x) = d_{0} + d_{1}\phi_{1}(x) + \sum_{h=1}^{q}c_{h}\rho(x,x_{h}^{*})
where the \phi_{1}
is a linear function, \rho
is the reproducing kernel of the contrast (nonlinear) space, and \{x_{h}^{*}\}_{h=1}^{q}
are the selected spline knots.
This implies that the penalized least-squares functional can be rewritten as
\|\mathbf{y} - \mathbf{K}\mathbf{d} - \mathbf{J}\mathbf{c}\|^{2} + n\lambda\mathbf{c}'\mathbf{Q}\mathbf{c}
where \mathbf{K}=\{\phi(x_{i})\}_{n \times 2}
is the null space basis function matrix, \mathbf{J}=\{\rho(x_{i},x_{h}^{*})\}_{n \times q}
is the contrast space basis funciton matrix, \mathbf{Q}=\{\rho(x_{g}^{*},x_{h}^{*})\}_{q \times q}
is the penalty matrix, and \mathbf{d}=(d_{0},d_{1})'
and \mathbf{c}=(c_{1},\ldots,c_{q})'
are the unknown basis function coefficients.
Given the smoothing parameter \lambda
, the optimal basis function coefficients have the form
\left(\begin{array}{cc} \hat{\mathbf{d}} \\ \hat{\mathbf{c}} \end{array}\right) =
\left(\begin{array}{cc} \mathbf{K'K} & \mathbf{K}'\mathbf{J} \\
\mathbf{J}'\mathbf{K} & \mathbf{J}'\mathbf{J} + n\lambda\mathbf{Q} \end{array}\right)^{\dagger} \left(\begin{array}{c} \mathbf{K}' \\ \mathbf{J}' \end{array}\right)\mathbf{y}
where (\cdot)^{\dagger}
denotes the pseudoinverse of the input matrix.
Given the optimal coefficients, the fitted values are given by \hat{\mathbf{y}} = \mathbf{K}\hat{\mathbf{d}}+\mathbf{J}\hat{\mathbf{c}} = \mathbf{S}_{\lambda}\mathbf{y}
, where
\mathbf{S}_{\lambda} = \left(\begin{array}{cc} \mathbf{K} & \mathbf{J} \end{array}\right)
\left(\begin{array}{cc} \mathbf{K'K} & \mathbf{K}'\mathbf{J} \\
\mathbf{J}'\mathbf{K} & \mathbf{J}'\mathbf{J} + n\lambda\mathbf{Q} \end{array}\right)^{\dagger} \left(\begin{array}{c} \mathbf{K}' \\ \mathbf{J}' \end{array}\right)
is the smoothing matrix, which depends on \lambda
.
Spline Types
For a linear spline (type="lin"
) with x \in [0,1]
, the needed functions are
\phi_{1}(x) = 0 \qquad \mbox{and} \qquad \rho(x,z) = k_{1}(x)k_{1}(z)+k_{2}(|x-z|)
where k_{1}(x)=x-0.5
, k_{2}(x)=\frac{1}{2}\left(k_{1}^{2}(x) - \frac{1}{12} \right)
; in this case \mathbf{K}=\mathbf{1}_{n}
and \mathbf{d}=d_{0}
.
For a cubic spline (type="cub"
) with x \in [0,1]
, the needed functions are
\phi_{1}(x) = k_{1}(x) \qquad \mbox{and} \qquad \rho(x,z) = k_{2}(x)k_{2}(z)-k_{4}(|x-z|)
where k_{1}
and k_{2}
are defined above, and k_{4}(x)=\frac{1}{24}\left(k_{1}^{4}(x) - \frac{k_{1}^{2}(x)}{2} + \frac{7}{240} \right)
.
For a different cubic spline (type="cub0"
) with x \in [0,1]
, the needed functions are
\phi_{1}(x) = x \qquad \mbox{and} \qquad \rho(x,z) = (x \wedge z)^2[3(x \vee z) - (x \wedge z)]/6
where (x \wedge z) = \min(x,z)
and (x \vee z) = \max(x,z)
.
Note that type="cub"
and type="cub0"
use different definitions of the averaging operator in the null space. The overall spline estimates should be the same (up to approximation accuracy), but the null and constrast space effect functions will differ (see predict.bigspline
). See Helwig (2013) and Gu (2013) for a further discussion of polynomial splines.
For a periodic cubic spline (type="per"
) with x \in [0,1]
, the needed functions are
\phi_{1}(x) = 0 \qquad \mbox{and} \qquad \rho(x,z) = -k_{4}(|x-z|)
where k_{4}(x)
is defined as it was for type="cub"
; in this case \mathbf{K}=\mathbf{1}_{n}
and \mathbf{d}=d_{0}
.
Note
The spline is estimated using penalized least-squares, which does not require the Gaussian error assumption. However, the spline inference information (e.g., standard errors and fit information) requires the Gaussian error assumption.
Author(s)
Nathaniel E. Helwig <helwig@umn.edu>
References
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer.
Helwig, N. E. (2013). Fast and stable smoothing spline analysis of variance models for large samples with applications to electroencephalography data analysis. Unpublished doctoral dissertation. University of Illinois at Urbana-Champaign.
Helwig, N. E. and Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24, 715-732.
Helwig, N. E. and Ma, P. (2016). Smoothing spline ANOVA for super-large samples: Scalable computation via rounding parameters. Statistics and Its Interface, 9, 433-444.
Examples
########## EXAMPLE 1 ##########
# define relatively smooth function
set.seed(773)
myfun <- function(x){ sin(2*pi*x) }
x <- runif(10^6)
y <- myfun(x) + rnorm(10^6)
# linear, cubic, different cubic, and periodic splines
linmod <- bigspline(x,y,type="lin")
linmod
cubmod <- bigspline(x,y)
cubmod
cub0mod <- bigspline(x,y,type="cub0")
cub0mod
permod <- bigspline(x,y,type="per")
permod
########## EXAMPLE 2 ##########
# define more jagged function
set.seed(773)
myfun <- function(x){ 2*x + cos(4*pi*x) }
x <- runif(10^6)*4
y <- myfun(x) + rnorm(10^6)
# try different numbers of knots
r1mod <- bigspline(x,y,nknots=20)
crossprod( myfun(r1mod$xunique) - r1mod$fitted )/length(r1mod$fitted)
r2mod <- bigspline(x,y,nknots=30)
crossprod( myfun(r2mod$xunique) - r2mod$fitted )/length(r2mod$fitted)
r3mod <- bigspline(x,y,nknots=40)
crossprod( myfun(r3mod$xunique) - r3mod$fitted )/length(r3mod$fitted)
########## EXAMPLE 3 ##########
# define more jagged function
set.seed(773)
myfun <- function(x){ 2*x + cos(4*pi*x) }
x <- runif(10^6)*4
y <- myfun(x) + rnorm(10^6)
# try different rounding parameters
r1mod <- bigspline(x,y,rparm=0.05)
crossprod( myfun(r1mod$xunique) - r1mod$fitted )/length(r1mod$fitted)
r2mod <- bigspline(x,y,rparm=0.02)
crossprod( myfun(r2mod$xunique) - r2mod$fitted )/length(r2mod$fitted)
r3mod <- bigspline(x,y,rparm=0.01)
crossprod( myfun(r3mod$xunique) - r3mod$fitted )/length(r3mod$fitted)