frscv {crs} | R Documentation |
Categorical Factor Regression Spline Cross-Validation
Description
frscv performs an exhaustive, cross-validation-directed search for a regression spline estimate of a one (1) dimensional dependent variable on an r-dimensional vector of continuous predictors and nominal/ordinal (factor/ordered) predictors.
Usage
frscv(xz,
      y,
      degree.max = 10,
      segments.max = 10,
      degree.min = 0,
      segments.min = 1,
      complexity = c("degree-knots","degree","knots"),
      knots = c("quantiles","uniform","auto"),
      basis = c("additive","tensor","glp","auto"),
      cv.func = c("cv.ls","cv.gcv","cv.aic"),
      degree = degree,
      segments = segments,
      tau = NULL,
      weights = NULL,
      singular.ok = FALSE)
Arguments
y | continuous univariate vector
xz | continuous and/or nominal/ordinal (factor/ordered) predictors
degree.max | the maximum degree of the B-spline basis for each of the continuous predictors (default degree.max=10)
segments.max | the maximum number of segments of the B-spline basis for each of the continuous predictors (default segments.max=10)
degree.min | the minimum degree of the B-spline basis for each of the continuous predictors (default degree.min=0)
segments.min | the minimum number of segments of the B-spline basis for each of the continuous predictors (default segments.min=1)
complexity | a character string (default complexity="degree-knots"); one of "degree-knots", "degree", or "knots"
knots | a character string (default knots="quantiles"); one of "quantiles", "uniform", or "auto"
basis | a character string (default basis="additive"); one of "additive", "tensor", "glp", or "auto"
cv.func | a character string (default cv.func="cv.ls"); one of "cv.ls", "cv.gcv", or "cv.aic"
degree | integer/vector specifying the degree of the B-spline basis for each dimension of the continuous predictors
segments | integer/vector specifying the number of segments of the B-spline basis for each dimension of the continuous predictors
tau | if non-NULL, a number in (0,1) denoting the quantile for which a quantile regression spline is to be estimated rather than the conditional mean (default tau=NULL)
weights | an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used with weights 'weights' (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used.
singular.ok | a logical value (default singular.ok=FALSE)
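The three cv.func choices correspond to standard model-selection criteria for linear smoothers. The sketch below computes textbook versions of each for a single B-spline fit, using base R's splines package rather than crs. The data, the knot placement, and the exact formulas are assumptions made for illustration; the expressions crs evaluates internally may differ in detail.

```r
## Illustrative only: textbook forms of the three criteria for one spline fit.
## These are not taken from the crs source code.
library(splines)  # ships with R

set.seed(1)
n <- 200
x <- runif(n)
y <- cos(2 * pi * x) + rnorm(n, sd = 0.1)

## Cubic B-spline basis with 4 segments (3 interior knots at quantiles of x)
B <- bs(x, degree = 3, knots = unname(quantile(x, c(0.25, 0.5, 0.75))))
fit <- lm(y ~ B)

e <- residuals(fit)
h <- hatvalues(fit)   # diagonal of the hat matrix
trH <- sum(h)         # trace of the hat matrix (number of fitted coefficients)

## Least-squares (leave-one-out) cross-validation
cv.ls <- mean((e / (1 - h))^2)
## Generalized cross-validation (Craven and Wahba, 1979)
cv.gcv <- mean(e^2) / (1 - trH / n)^2
## Improved AIC (Hurvich, Simonoff and Tsai, 1998)
cv.aic <- log(mean(e^2)) + (1 + trH / n) / (1 - (trH + 2) / n)

print(c(cv.ls = cv.ls, cv.gcv = cv.gcv, cv.aic = cv.aic))
```

All three penalize the in-sample residual variance by a function of the hat-matrix trace, so each is minimized over candidate models in the same way.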
Details
frscv computes exhaustive cross-validation for a regression spline estimate of a one (1) dimensional dependent variable on an r-dimensional vector of continuous and nominal/ordinal (factor/ordered) predictors. The optimal K/I combination (i.e. degree/segments/I) is returned along with other results (see below for return values).
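To make the idea of an exhaustive search concrete, here is a minimal sketch for a single continuous predictor using base R's splines package instead of crs. Every degree/segments pair on a small grid is fitted and scored by leave-one-out cross-validation, and the minimizer is kept. The helper loo.cv, the grid bounds, and the simulated data are hypothetical. The sketch also omits the inclusion indicator I for categorical predictors, and it starts at degree 1 because splines::bs does not accept degree 0.

```r
## Illustrative exhaustive search over (degree, segments); not the crs implementation.
library(splines)  # ships with R

set.seed(42)
n <- 300
x <- runif(n)
y <- cos(2 * pi * x) + rnorm(n, sd = 0.1)

## Leave-one-out CV score for a B-spline fit of given degree and segment count
loo.cv <- function(degree, segments) {
  ## 'segments' intervals imply segments - 1 interior knots, placed at quantiles
  interior <- if (segments > 1) {
    unname(quantile(x, seq_len(segments - 1) / segments))
  } else {
    NULL
  }
  B <- bs(x, degree = degree, knots = interior)
  fit <- lm(y ~ B)
  mean((residuals(fit) / (1 - hatvalues(fit)))^2)
}

## Evaluate every combination on the grid and keep the minimizer
grid <- expand.grid(degree = 1:5, segments = 1:5)
grid$cv <- mapply(loo.cv, grid$degree, grid$segments)
best <- grid[which.min(grid$cv), ]
print(best)
```

The cost grows with the product of the grid sizes across predictors, which is why degree.max and segments.max bound the search.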
For the continuous predictors the regression spline model employs either the additive or tensor product B-spline basis matrix for a multivariate polynomial spline via the B-spline routines in the GNU Scientific Library (https://www.gnu.org/software/gsl/) and the tensor.prod.model.matrix function.
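The difference between the additive and tensor product bases is easiest to see in their dimensions. The sketch below builds both from two univariate B-spline bases using base R only. It does not call the GSL routines or tensor.prod.model.matrix; the row-wise Kronecker construction is a standard way to form a tensor product basis and is assumed here purely for illustration.

```r
## Illustrative comparison of additive and tensor product bases (base R only).
library(splines)  # ships with R

set.seed(7)
n <- 50
x1 <- runif(n)
x2 <- runif(n)

## Univariate cubic B-spline bases, stored as plain n x 3 matrices
B1 <- matrix(bs(x1, degree = 3), nrow = n)
B2 <- matrix(bs(x2, degree = 3), nrow = n)

## Additive basis: columns are concatenated, so dimensions add (3 + 3)
additive <- cbind(B1, B2)

## Tensor product basis: row-wise Kronecker product, so dimensions multiply (3 * 3)
tensor <- t(sapply(seq_len(n), function(i) kronecker(B1[i, ], B2[i, ])))

print(dim(additive))
print(dim(tensor))
```

The tensor basis can capture interactions between predictors, at the price of a column count that multiplies rather than adds as predictors are included.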
For the nominal/ordinal (factor/ordered) predictors the regression spline model uses indicator basis functions.
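An indicator basis for a factor is a set of 0/1 columns, one per level. The following base R sketch shows two equivalent ways to build one. The toy factor z is hypothetical, and how crs constructs its indicator basis internally is not shown here.

```r
## Illustrative indicator (dummy) basis for a factor, base R only.
z <- factor(c("a", "b", "c", "a", "b"))

## One 0/1 column per level
Z <- outer(as.character(z), levels(z), "==") * 1
colnames(Z) <- levels(z)
print(Z)

## With an intercept in the model, one level is absorbed as the baseline
Z.mm <- model.matrix(~ z)
print(colnames(Z.mm))
```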
Value
frscv returns a crscv object. Furthermore, the function summary supports objects of this type. The returned objects have the following components:
K | scalar/vector containing optimal degree(s) of spline or number of segments
I | scalar/vector containing an indicator of whether the predictor is included or not for each dimension of the nominal/ordinal (factor/ordered) predictors
K.mat | vector/matrix of values of K
cv.func | objective function value at optimum
cv.func.vec | vector of objective function values at each degree of spline or number of segments in K.mat
Author(s)
Jeffrey S. Racine racinej@mcmaster.ca
References
Craven, P. and G. Wahba (1979), “Smoothing Noisy Data With Spline Functions,” Numerische Mathematik, 31, 377-403.
Hurvich, C.M. and J.S. Simonoff and C.L. Tsai (1998), “Smoothing Parameter Selection in Nonparametric Regression Using an Improved Akaike Information Criterion,” Journal of the Royal Statistical Society B, 60, 271-293.
Li, Q. and J.S. Racine (2007), Nonparametric Econometrics: Theory and Practice, Princeton University Press.
Ma, S. and J.S. Racine and L. Yang (2015), “Spline Regression in the Presence of Categorical Predictors,” Journal of Applied Econometrics, Volume 30, 705-717.
Ma, S. and J.S. Racine (2013), “Additive Regression Splines with Irrelevant Categorical and Continuous Regressors,” Statistica Sinica, Volume 23, 515-541.
See Also
Examples
## Load crs, which provides frscv()
library(crs)
set.seed(42)
## Simulated data
n <- 1000
x <- runif(n)
z <- round(runif(n,min=-0.5,max=1.5))
z.unique <- uniquecombs(as.matrix(z))
ind <- attr(z.unique,"index")
ind.vals <- sort(unique(ind))
dgp <- numeric(length=n)
## Build the conditional mean separately for each unique value of z
for(i in 1:nrow(z.unique)) {
  zz <- ind == ind.vals[i]
  dgp[zz] <- z[zz]+cos(2*pi*x[zz])
}
y <- dgp + rnorm(n,sd=.1)
xdata <- data.frame(x,z=factor(z))
## Compute the optimal K and I, determine optimal number of knots, set
## spline degree for x to 3
cv <- frscv(xz=xdata,y=y,complexity="knots",degree=c(3))
summary(cv)