| conspline {ConSpline} | R Documentation | 
Partial Linear Least-Squares with Constrained Regression Splines
Description
Given a response variable y, a continuous predictor x, and a design matrix Z of parametrically modeled covariates, this function solves a least-squares regression assuming that y=f(x)+Zb+e, where f is a smooth function with a user-defined shape. The shape is assigned with the argument type, where 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave.
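The eight shape codes combine a monotonicity condition with a curvature condition. As a rough illustration of what each code means, the sketch below checks a vector of function values on an equally spaced grid against a given code. The helper has_shape is hypothetical and is not part of ConSpline; it only inspects first and second differences.

```r
# Hypothetical helper (not part of ConSpline): checks whether values f,
# evaluated on an increasing, equally spaced grid, have the shape coded by `type`.
has_shape <- function(f, type, tol = 1e-8) {
  d1 <- diff(f)                    # first differences: sign gives monotonicity
  d2 <- diff(f, differences = 2)   # second differences: sign gives curvature
  incr <- all(d1 >= -tol)
  decr <- all(d1 <= tol)
  conv <- all(d2 >= -tol)
  conc <- all(d2 <= tol)
  switch(as.character(type),
    "1" = incr, "2" = decr, "3" = conv, "4" = conc,
    "5" = incr && conv, "6" = decr && conv,
    "7" = incr && conc, "8" = decr && conc,
    stop("type must be an integer from 1 to 8"))
}

x <- seq(0, 1, length.out = 50)
has_shape(x^2, 5)        # x^2 is increasing and convex on [0, 1]
has_shape(sqrt(x), 7)    # sqrt(x) is increasing and concave
has_shape(x^2, 2)        # x^2 is not decreasing on [0, 1]
```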
Usage
conspline(y,x,type,zmat=0,wt=0,knots=0,
   test=FALSE,c=1.2,nsim=10000)
Arguments
| y | A continuous response variable | 
| x | A continuous predictor variable. The length of x must equal the length of y. | 
| type | An integer 1-8 describing the shape of the regression function in x. 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave. | 
| zmat | An optional design matrix of covariates to be modeled parametrically. The number of rows of zmat must be the length of y. | 
| wt | Optional weight vector. The weights must be positive, and the vector must have the same length as y. | 
| knots | Optional user-defined knots for the spline function. The range of the knots must contain the range of x. | 
| test | If test=TRUE, a test for the "significance" of x is performed. For convex and concave shapes, the null hypothesis is that the relationship between y and x is linear; for any of the other shapes, the null hypothesis is that the expected value of y is constant in x. | 
| c | An optional parameter for the variance estimation. Must be between 1 and 2 inclusive. | 
| nsim | An optional specification of the number of simulated data sets to make the mixing distribution for the test statistic if test=TRUE. | 
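The requirements listed above can be checked before the fit is attempted. The following is a minimal sketch of such a pre-flight check. The function check_conspline_inputs is hypothetical and not part of ConSpline, and it uses NULL rather than 0 to mean "not supplied", which is an assumption made for readability.

```r
# Hypothetical pre-flight check (not part of ConSpline): enforces the
# documented argument requirements. NULL means "argument not supplied" here.
check_conspline_inputs <- function(y, x, type, zmat = NULL, wt = NULL,
                                   knots = NULL, c = 1.2) {
  n <- length(y)
  if (length(x) != n)
    stop("x must have the same length as y")
  if (!(length(type) == 1 && type %in% 1:8))
    stop("type must be an integer from 1 to 8")
  if (!is.null(zmat) && NROW(zmat) != n)
    stop("zmat must have one row per observation")
  if (!is.null(wt) && (length(wt) != n || any(wt <= 0)))
    stop("wt must be positive and of the same length as y")
  if (!is.null(knots) && (min(knots) > min(x) || max(knots) < max(x)))
    stop("the range of the knots must contain the range of x")
  if (c < 1 || c > 2)
    stop("c must be between 1 and 2 inclusive")
  invisible(TRUE)
}

x <- 1:20 / 20
y <- x^2
# Knots spanning [0, 1] cover the range of x, so this passes silently.
check_conspline_inputs(y, x, type = 5, knots = seq(0, 1, by = 0.25))
# Knots that do not cover the range of x are rejected; keep the message.
msg <- tryCatch(check_conspline_inputs(y, x, 5, knots = c(0.2, 0.8)),
                error = function(e) conditionMessage(e))
```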
Details
A cone projection is used to fit the least-squares regression model. The test for the significance of x is exact, while the inference for the covariates represented by the Z columns uses statistics that have approximate t-distributions.
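To give a feel for what a cone projection does, the sketch below uses isotonic regression from base R (isoreg in the stats package). This is not the spline-based algorithm used by conspline; it is only the simplest instance of a least-squares projection onto a convex cone, here the cone of nondecreasing vectors. The simulated data and the seed are arbitrary.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 40
x <- 1:n / n                      # already sorted, so yf lines up with y
y <- x^2 + rnorm(n) / 10          # noisy increasing signal

# Least-squares projection of y onto the cone of nondecreasing vectors
fit <- isoreg(x, y)$yf

# The projection lies in the cone ...
is_monotone <- all(diff(fit) >= -1e-12)
# ... and the residual is orthogonal to the projection
ortho <- sum((y - fit) * fit)
# No other vector in the cone is closer to y; x^2 is one such vector
closer_than_truth <- sum((y - fit)^2) <= sum((y - x^2)^2)
```

The orthogonality of the residual and the fit is a general property of projections onto convex cones, and it is what distinguishes them from projections onto arbitrary convex sets.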
Value
| muhat | The fitted values at the design points, i.e. an estimate of E(y). | 
| fhat | The estimated regression function, evaluated at the x-values, describing the relationship between E(y) and x, see above description of the model. | 
| fslope | The slope of fhat, evaluated at the x-values. | 
| knots | The knots used in the spline function estimation. | 
| pvalx | If test=TRUE, this is the p-value for the test involving the predictor x. For convex and concave shapes, the null hypothesis is that the relationship between y and x is linear, versus the alternative that it has the assigned shape. For any of the other shapes, the null hypothesis is that the expected value of y is constant in x, versus the assigned shape. | 
| zcoef | The estimated coefficients for the components of the regression function given by the columns of Z. An "intercept" is given if the column space of Z does not contain the constant vectors. | 
| sighat | The estimate of the model variance. Calculated as SSR/(n-cD), where SSR is the sum of squared residuals of the fit, n is the length of y, D is the observed degrees of freedom of the fit, and c is a parameter between 1 and 2. | 
| zhmat | The hat matrix corresponding to the columns of Z. It can be used, for example, to compute p-values for contrasts. | 
| sez | The standard errors for the Z coefficient estimates. These are square roots of the diagonal values of zhmat, times the square root of sighat. | 
| pvalz | Approximate p-values for the null hypotheses that the coefficients for the covariates represented by the Z columns are zero. | 
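The formulas for sighat and sez, and the use of zhmat for contrasts, can be sketched with a few lines of arithmetic. All numbers below are made up for illustration and are not output from conspline. The contrast standard error sqrt(sighat * a' zhmat a) and the use of n-cD as the degrees of freedom for the t reference distribution are assumptions inferred from the descriptions of sez and sighat above, not something this page states.

```r
# Illustrative values only (hypothetical, not produced by conspline)
ssr  <- 3.2     # sum of squared residuals of the fit
n    <- 60      # number of observations
D    <- 5       # observed degrees of freedom of the fit
cpar <- 1.2     # the parameter c, between 1 and 2

# Variance estimate: SSR / (n - c * D)
sighat <- ssr / (n - cpar * D)

# Hypothetical zhmat and coefficient estimates for two covariates
zhmat <- matrix(c(0.07, 0.01,
                  0.01, 0.09), nrow = 2, byrow = TRUE)
zcoef <- c(0.25, 0.40)

# Standard errors: sqrt(diag(zhmat)) times sqrt(sighat)
sez <- sqrt(diag(zhmat)) * sqrt(sighat)

# Contrast between the two coefficients (assumed form of the standard error)
a     <- c(1, -1)
est   <- sum(a * zcoef)
se    <- sqrt(sighat * drop(t(a) %*% zhmat %*% a))
tstat <- est / se
# Approximate two-sided p-value; the degrees of freedom are an assumption
pval  <- 2 * pt(-abs(tstat), df = n - cpar * D)
```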
Author(s)
Mary C Meyer, Professor, Statistics Department, Colorado State University
References
Meyer, M.C. (2008) Shape-Restricted Regression Splines, Annals of Applied Statistics, 2(3), 1013-1033.
Examples
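library(ConSpline)
## simulate data: mean is flat, then convex in x, plus a shift for the binary covariate z
n=60
x=1:n/n
z=sample(0:1,n,replace=TRUE)
mu=rep(4,n)
mu[x>1/2]=4+5*(x[x>1/2]-1/2)^2
mu=mu+z/4
y=mu+rnorm(n)/4
plot(x,y,col=z+1)
## fit an increasing and convex f (type 5) with z as a covariate, and test x
ans=conspline(y,x,5,z,test=TRUE)
points(x,ans$muhat,pch=20,col=z+1)
## overlay fhat, and fhat shifted by the estimated z coefficient (red)
lines(x,ans$fhat)
lines(x,ans$fhat+ans$zcoef, col=2)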
ans$pvalz  ## p-val for test of significance of z parameter
ans$pvalx  ## p-val for test of constant vs increasing and convex regression function (type 5)