conspline {ConSpline}

R Documentation

Partial Linear Least-Squares with Constrained Regression Splines

Description

Given a response variable y, a continuous predictor x, and a design matrix Z of parametrically modeled covariates, this function fits the least-squares regression model y=f(x)+Zb+e, where f is a smooth function with a user-defined shape. The shape is set with the argument type, where 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave.

Usage

conspline(y,x,type,zmat=0,wt=0,knots=0,
   test=FALSE,c=1.2,nsim=10000)

Arguments

`y`	A continuous response variable
`x`	A continuous predictor variable. The length of x must equal the length of y.
`type`	An integer 1-8 describing the shape of the regression function in x. 1=increasing, 2=decreasing, 3=convex, 4=concave, 5=increasing and convex, 6=decreasing and convex, 7=increasing and concave, 8=decreasing and concave.
`zmat`	An optional design matrix of covariates to be modeled parametrically. The number of rows of zmat must be the length of y.
`wt`	An optional weight vector. The weights must be positive, and the vector must have the same length as y.
`knots`	Optional user-defined knots for the spline function. The range of the knots must contain the range of x.
`test`	If test=TRUE, a test for the "significance" of x is performed. For convex and concave shapes, the null hypothesis is that the relationship between y and x is linear; for any of the other shapes, the null hypothesis is that the expected value of y is constant in x.
`c`	An optional parameter for the variance estimation. Must be between 1 and 2 inclusive.
`nsim`	An optional specification of the number of simulated data sets to make the mixing distribution for the test statistic if test=TRUE.
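
The constraints listed for the arguments can be collected into a small pre-flight check. The sketch below is illustrative only: check_conspline_args is a hypothetical helper, not part of ConSpline, and it uses NULL rather than the package's 0 defaults to mean "not supplied".

```r
# Hypothetical helper (not part of ConSpline): collects violations of the
# documented argument constraints before conspline() is called.
check_conspline_args <- function(y, x, type, zmat = NULL, wt = NULL,
                                 knots = NULL, c = 1.2) {
  n <- length(y)
  problems <- character(0)
  if (length(x) != n)
    problems <- c(problems, "x and y must have the same length")
  if (length(type) != 1 || !(type %in% 1:8))
    problems <- c(problems, "type must be a single integer from 1 to 8")
  if (!is.null(zmat) && NROW(zmat) != n)
    problems <- c(problems, "zmat must have one row per element of y")
  if (!is.null(wt) && (length(wt) != n || any(wt <= 0)))
    problems <- c(problems, "wt must be positive with the same length as y")
  if (!is.null(knots) && (min(knots) > min(x) || max(knots) < max(x)))
    problems <- c(problems, "the range of knots must contain the range of x")
  if (c < 1 || c > 2)
    problems <- c(problems, "c must be between 1 and 2 inclusive")
  problems  # character(0) when every check passes
}

x <- seq(0, 1, length.out = 20)
y <- x^2
check_conspline_args(y, x, type = 5)                       # nothing to report
check_conspline_args(y, x, type = 9, knots = c(0.1, 0.9))  # bad type, bad knots
```

Returning the messages instead of stopping at the first failure lets a caller see all violations at once.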

Details

A cone projection is used to fit the least-squares regression model. The test for the significance of x is exact, while the inference for the covariates represented by the Z columns uses statistics that have approximate t-distributions.

Value

`muhat`	The fitted values at the design points, i.e. an estimate of E(y).
`fhat`	The estimated regression function, evaluated at the x-values, describing the relationship between E(y) and x, see above description of the model.
`fslope`	The slope of fhat, evaluated at the x-values.
`knots`	The knots used in the spline function estimation.
`pvalx`	If test=TRUE, this is the p-value for the test involving the predictor x. For convex and concave shapes, the null hypothesis is that the relationship between y and x is linear, versus the alternative that it has the assigned shape. For any of the other shapes, the null hypothesis is that the expected value of y is constant in x, versus the assigned shape.
`zcoef`	The estimated coefficients for the components of the regression function given by the columns of Z. An "intercept" is given if the column space of Z does not contain the constant vectors.
`sighat`	The estimate of the model variance. Calculated as SSR/(n-cD), where SSR is the sum of squared residuals of the fit, n is the length of y, D is the observed degrees of freedom of the fit, and c is a parameter between 1 and 2.
`zhmat`	The hat matrix corresponding to the columns of Z, which can be used, for example, to compute p-values for contrasts.
`sez`	The standard errors for the Z coefficient estimates. These are square roots of the diagonal values of zhmat, times the square root of sighat.
`pvalz`	Approximate p-values for the null hypotheses that the coefficients for the covariates represented by the Z columns are zero.
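
The formulas given for sighat and sez can be traced with a short worked example. All numbers below are made up for illustration and do not come from a real fit; the contrast calculation additionally assumes that sighat times zhmat plays the role of the covariance matrix of zcoef, which the formula for sez suggests but this page does not state outright.

```r
# Illustrative numbers only; none of these come from a real fit.
ssr   <- 3.6   # sum of squared residuals (SSR)
n     <- 60    # length of y
D     <- 5     # observed degrees of freedom of the fit
c_par <- 1.2   # the 'c' argument, between 1 and 2 inclusive

# Variance estimate as documented: SSR / (n - c * D) = 3.6 / (60 - 6)
sighat <- ssr / (n - c_par * D)

# A made-up 2 x 2 stand-in for zhmat
zhmat <- matrix(c(0.04, 0.01,
                  0.01, 0.09), nrow = 2, byrow = TRUE)

# Standard errors as documented: sqrt(diag(zhmat)) * sqrt(sighat)
sez <- sqrt(diag(zhmat)) * sqrt(sighat)

# Assumption: sighat * zhmat acts as the covariance matrix of zcoef, so the
# standard error of the contrast a'b is sqrt(sighat * a' zhmat a).
a <- c(1, -1)
se_contrast <- sqrt(sighat * drop(t(a) %*% zhmat %*% a))
```

With a real fit, ans$sighat, ans$zhmat and ans$sez would take the place of the made-up values.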

Author(s)

Mary C Meyer, Professor, Statistics Department, Colorado State University

References

Meyer, M.C. (2008) Shape-Restricted Regression Splines, Annals of Applied Statistics, 2(3), 1013-1033.

Examples

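library(ConSpline)
n=60
x=1:n/n
## binary covariate, modeled parametrically
z=sample(0:1,n,replace=TRUE)
## true regression function: constant up to x=1/2, then increasing and convex
mu=1:n*0+4
mu[x>1/2]=4+5*(x[x>1/2]-1/2)^2
## shift of 1/4 for observations with z=1
mu=mu+z/4
y=mu+rnorm(n)/4
plot(x,y,col=z+1)
## type 5: increasing and convex
ans=conspline(y,x,5,z,test=TRUE)
points(x,ans$muhat,pch=20,col=z+1)
lines(x,ans$fhat)
lines(x,ans$fhat+ans$zcoef, col=2)
ans$pvalz  ## p-val for test of significance of z parameter
ans$pvalx  ## p-val for test for linear vs convex regression function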
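
A second, smaller sketch shows a fit with no covariates and an increasing constraint (type=1), using made-up data. It only calls conspline if the ConSpline package is installed. With an increasing shape one would expect the values in fslope to be nonnegative up to rounding error, though that is an expectation from the model description rather than something this page guarantees.

```r
# Sketch only: the fit runs only if the ConSpline package is installed.
set.seed(1)
n <- 50
x <- seq(0, 1, length.out = n)
y <- sqrt(x) + rnorm(n) / 5   # noisy increasing trend (made-up data)

fit <- NULL
if (requireNamespace("ConSpline", quietly = TRUE)) {
  fit <- ConSpline::conspline(y, x, type = 1)   # 1 = increasing
  plot(x, y)
  lines(x, fit$muhat)   # fitted values at the design points
  range(fit$fslope)     # slopes of fhat at the x-values
}
```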

[Package ConSpline version 1.2 Index]