conspline {ConSpline} | R Documentation |
Partial Linear Least-Squares with Constrained Regression Splines
Description
Given a response variable y, a continuous predictor x, and a design matrix Z of parametrically modeled covariates, this function solves a least-squares regression assuming that y=f(x)+Zb+e, where f is a smooth function with a user-defined shape. The shape is assigned with the argument type, where 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave.
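Because the integer shape codes are easy to mix up, a small named lookup can make calls easier to read. The sketch below is only an illustration: the name shape_types is hypothetical and not part of ConSpline, while the codes themselves are the ones listed above.

```r
## Hypothetical helper (not part of ConSpline): named lookup for the
## integer shape codes accepted by the type argument.
shape_types <- c(
  "increasing"             = 1L,
  "decreasing"             = 2L,
  "convex"                 = 3L,
  "concave"                = 4L,
  "increasing and convex"  = 5L,
  "decreasing and convex"  = 6L,
  "increasing and concave" = 7L,
  "decreasing and concave" = 8L
)

## Look up the code for a shape by name.
type <- shape_types[["increasing and convex"]]
```

The resulting integer can then be supplied as the type argument of conspline.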
Usage
conspline(y,x,type,zmat=0,wt=0,knots=0,
test=FALSE,c=1.2,nsim=10000)
Arguments
y |
A continuous response variable |
x |
A continuous predictor variable. The length of x must equal the length of y. |
type |
An integer 1-8 describing the shape of the regression function in x. 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave. |
zmat |
An optional design matrix of covariates to be modeled parametrically. The number of rows of zmat must be the length of y. |
wt |
Optional weight vector. The weights must be positive, and the vector must have the same length as y. |
knots |
Optional user-defined knots for the spline function. The range of the knots must contain the range of x. |
test |
If test=TRUE, a test for the "significance" of x is performed. For convex and concave shapes, the null hypothesis is that the relationship between y and x is linear; for any of the other shapes, the null hypothesis is that the expected value of y is constant in x. |
c |
An optional parameter for the variance estimation. Must be between 1 and 2 inclusive. |
nsim |
An optional specification of the number of simulated data sets to make the mixing distribution for the test statistic if test=TRUE. |
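The requirements stated for the arguments above can be checked before fitting. The following is a minimal sketch in base R: the function check_conspline_args is hypothetical and not part of ConSpline, and the equally spaced knots are an arbitrary choice for illustration.

```r
## Hypothetical pre-flight checks (not part of ConSpline), mirroring the
## requirements stated in the argument descriptions.
check_conspline_args <- function(y, x, zmat = NULL, wt = NULL,
                                 knots = NULL, c = 1.2) {
  ## x and y must have the same length
  stopifnot(length(x) == length(y))
  ## zmat must have one row per observation
  if (!is.null(zmat)) stopifnot(NROW(zmat) == length(y))
  ## weights must be positive and of the same length as y
  if (!is.null(wt)) stopifnot(length(wt) == length(y), all(wt > 0))
  ## the range of the knots must contain the range of x
  if (!is.null(knots)) stopifnot(min(knots) <= min(x), max(knots) >= max(x))
  ## c must lie between 1 and 2 inclusive
  stopifnot(c >= 1, c <= 2)
  invisible(TRUE)
}

## Toy inputs for illustration only.
n <- 20
x <- (1:n) / n
y <- x^2
knots <- seq(min(x), max(x), length.out = 5)
ok <- check_conspline_args(y, x, zmat = rep(0:1, n / 2),
                           wt = rep(1, n), knots = knots)
```

If any requirement fails, stopifnot raises an error naming the violated condition.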
Details
A cone projection is used to fit the least-squares regression model. The test for the significance of x is exact, while the inference for the covariates represented by the Z columns uses statistics that have approximate t-distributions.
Value
muhat |
The fitted values at the design points, i.e. an estimate of E(y). |
fhat |
The estimated regression function, evaluated at the x-values, describing the relationship between E(y) and x; see the model given in the Description. |
fslope |
The slope of fhat, evaluated at the x-values. |
knots |
The knots used in the spline function estimation. |
pvalx |
If test=TRUE, this is the p-value for the test involving the predictor x. For convex and concave shapes, the null hypothesis is that the relationship between y and x is linear, versus the alternative that it has the assigned shape. For any of the other shapes, the null hypothesis is that the expected value of y is constant in x, versus the assigned shape. |
zcoef |
The estimated coefficients for the components of the regression function given by the columns of Z. An "intercept" is given if the column space of Z does not contain the constant vectors. |
sighat |
The estimate of the model variance. Calculated as SSR/(n-cD), where SSR is the sum of squared residuals of the fit, n is the length of y, D is the observed degrees of freedom of the fit, and c is a parameter between 1 and 2. |
zhmat |
The hat matrix corresponding to the columns of Z. It can be used, for example, to compute p-values for contrasts. |
sez |
The standard errors for the Z coefficient estimates. These are square roots of the diagonal values of zhmat, times the square root of sighat. |
pvalz |
Approximate p-values for the null hypotheses that the coefficients for the covariates represented by the Z columns are zero. |
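The formulas for sighat and sez given above can be reproduced by hand. The sketch below uses made-up numbers, not output from a real fit, and every object in it is a stand-in for the returned value of the same name. The contrast calculation additionally assumes that sighat times zhmat acts as the covariance matrix of the Z coefficients, which is consistent with the description of sez but is not stated explicitly.

```r
## Illustrative numbers only; none of these values come from a real fit.
n   <- 60      # number of observations
ssr <- 3.3     # sum of squared residuals (SSR)
D   <- 5       # observed degrees of freedom of the fit
cc  <- 1.2     # the parameter c, between 1 and 2

## sighat = SSR / (n - c * D)
sighat <- ssr / (n - cc * D)

## Hypothetical 2 x 2 zhmat and coefficient vector for two Z columns.
zhmat <- matrix(c(0.07, 0.01,
                  0.01, 0.09), nrow = 2, byrow = TRUE)
zcoef <- c(0.25, -0.10)

## sez = sqrt(diag(zhmat)) * sqrt(sighat)
sez <- sqrt(diag(zhmat)) * sqrt(sighat)

## Standard error and t-type statistic for a contrast a'b, here the
## difference of the two coefficients (assumes covariance sighat * zhmat).
a <- c(1, -1)
se_contrast <- sqrt(sighat * drop(t(a) %*% zhmat %*% a))
t_contrast  <- sum(a * zcoef) / se_contrast
```

Turning t_contrast into a p-value requires a reference distribution; the page says only that the statistics have approximate t-distributions, so the degrees of freedom are left unspecified here.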
Author(s)
Mary C Meyer, Professor, Statistics Department, Colorado State University
References
Meyer, M.C. (2008) Shape-Restricted Regression Splines, Annals of Applied Statistics, 2(3), 1013-1033.
Examples
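library(ConSpline)
n=60
x=1:n/n
z=sample(0:1,n,replace=TRUE) ## binary covariate
## true mean: constant at 4 up to x=1/2, then increasing and convex
mu=rep(4,n)
mu[x>1/2]=4+5*(x[x>1/2]-1/2)^2
mu=mu+z/4 ## shift of 1/4 for observations with z=1
y=mu+rnorm(n)/4
plot(x,y,col=z+1)
ans=conspline(y,x,5,z,test=TRUE) ## type 5: increasing and convex
points(x,ans$muhat,pch=20,col=z+1) ## fitted values
lines(x,ans$fhat)
lines(x,ans$fhat+ans$zcoef, col=2)
ans$pvalz ## p-val for test of significance of z parameter
ans$pvalx ## p-val for test for linear vs convex regression function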