R: Predicts for "bigssa" Objects

predict.bigssa {bigsplines}

R Documentation

Predicts for "bigssa" Objects

Description

Get fitted values and standard error estimates for smoothing spline anova models.

Usage

## S3 method for class 'bigssa'
predict(object,newdata=NULL,se.fit=FALSE,include=object$tnames,
        effect=c("all","0","lin","non"),includeint=FALSE,
        design=FALSE,smoothMatrix=FALSE,intercept=NULL,...)

Arguments

`object`	Object of class "bigssa", which is output from `bigssa`.
`newdata`	Data frame or list containing the new data points for prediction. Variable names must match those used in the `formula` input of `bigssa`. See Details and Example. Default of `newdata=NULL` uses original data in `object` input.
`se.fit`	Logical indicating whether the standard errors of the fitted values should be estimated. Default is `se.fit=FALSE`.
`include`	Which terms to include in the estimate. You can get fitted values for any combination of terms in the `tnames` element of an "bigssa" object.
`effect`	Which effect to estimate: `effect="all"` gives `\hat{y}` for given terms in `include`, `effect="lin"` gives linear portion of `\hat{y}` for given terms in `include`, and `effect="non"` gives nonlinear portion of `\hat{y}` for given terms in `include`. Use `effect="0"` to return the intercept.
`includeint`	Logical indicating whether the intercept should be included in the prediction. If `include=object$tnames` and `effect="all"` (default), then this input is ignored and the intercept is automatically included in the prediction.
`design`	Logical indicating whether the design matrix should be returned.
`smoothMatrix`	Logical indicating whether the smoothing matrix should be returned.
`intercept`	Logical indicating whether the intercept should be included in the prediction. When used, this input overrides the `includeint` input.
`...`	Ignored.

Details

Uses the coefficient and smoothing parameter estimates from a fit smoothing spline anova (estimated by bigssa) to predict for new data.

Value

If se.fit=FALSE, design=FALSE, and smoothMatrix=FALSE, returns vector of fitted values.

Otherwise returns list with elements:

`fit`	Vector of fitted values
`se.fit`	Vector of standard errors of fitted values (if `se.fit=TRUE`)
`X`	Design matrix used to create fitted values (if `design=TRUE`)
`ix`	Index vector such that `fit=X%*%object$modelspec$coef[ix]` (if `design=TRUE`)
`S`	Smoothing matrix corresponding to fitted values (if `smoothMatrix=TRUE`)

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer.

Helwig, N. E. (2013). Fast and stable smoothing spline analysis of variance models for large samples with applications to electroencephalography data analysis. Unpublished doctoral dissertation. University of Illinois at Urbana-Champaign.

Helwig, N. E. (2016). Efficient estimation of variance components in nonparametric mixed-effects models with large samples. Statistics and Computing, 26, 1319-1336.

Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13.

Helwig, N. E. and Ma, P. (2015). Fast and stable multiple smoothing parameter selection in smoothing spline analysis of variance models with large samples. Journal of Computational and Graphical Statistics, 24, 715-732.

Helwig, N. E. and Ma, P. (2016). Smoothing spline ANOVA for super-large samples: Scalable computation via rounding parameters. Statistics and Its Interface, 9, 433-444.

Examples


##########   EXAMPLE 1   ##########

# define univariate function and data
set.seed(773)
myfun <- function(x){ 2 + x + sin(2*pi*x) }
x <- runif(500)
y <- myfun(x) + rnorm(500)

# fit cubic spline model
cubmod <- bigssa(y~x,type="cub",nknots=30)
crossprod( predict(cubmod) - myfun(x) )/500

# define new data for prediction
newdata <- data.frame(x=seq(0,1,length.out=100))

# get fitted values and standard errors for new data
yc <- predict(cubmod,newdata,se.fit=TRUE)

# plot results with 95% Bayesian confidence interval
plot(newdata$x,yc$fit,type="l")
lines(newdata$x,yc$fit+qnorm(.975)*yc$se.fit,lty=3)
lines(newdata$x,yc$fit-qnorm(.975)*yc$se.fit,lty=3)

# predict constant, linear, and nonlinear effects
yc0 <- predict(cubmod,newdata,se.fit=TRUE,effect="0")
ycl <- predict(cubmod,newdata,se.fit=TRUE,effect="lin")
ycn <- predict(cubmod,newdata,se.fit=TRUE,effect="non")
crossprod( yc$fit - (yc0$fit + ycl$fit + ycn$fit) )

# plot results with 95% Bayesian confidence intervals
par(mfrow=c(1,2))
plot(newdata$x,ycl$fit,type="l",main="Linear effect")
lines(newdata$x,ycl$fit+qnorm(.975)*ycl$se.fit,lty=3)
lines(newdata$x,ycl$fit-qnorm(.975)*ycl$se.fit,lty=3)
plot(newdata$x,ycn$fit,type="l",main="Nonlinear effect")
lines(newdata$x,ycn$fit+qnorm(.975)*ycn$se.fit,lty=3)
lines(newdata$x,ycn$fit-qnorm(.975)*ycn$se.fit,lty=3)
         
         
##########   EXAMPLE 2   ##########

# define bivariate function and data
set.seed(773)
myfun<-function(x){
  2 + x[,1]/10 - x[,2]/5 + 2*sin(sqrt(x[,1]^2+x[,2]^2+.1))/sqrt(x[,1]^2+x[,2]^2+.1)
}
x1v <- runif(500)*16-8
x2v <- runif(500)*16-8
y <- myfun(cbind(x1v,x2v)) + rnorm(500)

# tensor product cubic splines with 50 knots
cubmod <- bigssa(y~x1v*x2v,type=list(x1v="cub",x2v="cub"),nknots=50)
crossprod( predict(cubmod) - myfun(cbind(x1v,x2v)) )/500

# define new data for prediction
xnew <- as.matrix(expand.grid(seq(-8,8,l=50),seq(-8,8,l=50)))
newdata <- list(x1v=xnew[,1],x2v=xnew[,2])

# get fitted values for new data
yp <- predict(cubmod,newdata)

# plot results
imagebar(seq(-8,8,l=50),seq(-8,8,l=50),matrix(yp,50,50),
         xlab=expression(italic(x)[1]),ylab=expression(italic(x)[2]),
         zlab=expression(hat(italic(y))))

# predict linear and nonlinear effects for x1v
newdata <- list(x1v=seq(-8,8,length.out=100))
yl <- predict(cubmod,newdata,include="x1v",effect="lin",se.fit=TRUE)
yn <- predict(cubmod,newdata,include="x1v",effect="non",se.fit=TRUE)

# plot results with 95% Bayesian confidence intervals
par(mfrow=c(1,2))
plot(newdata$x1v,yl$fit,type="l",main="Linear effect")
lines(newdata$x1v,yl$fit+qnorm(.975)*yl$se.fit,lty=3)
lines(newdata$x1v,yl$fit-qnorm(.975)*yl$se.fit,lty=3)
plot(newdata$x1v,yn$fit,type="l",main="Nonlinear effect",ylim=c(-.3,.4))
lines(newdata$x1v,yn$fit+qnorm(.975)*yn$se.fit,lty=3)
lines(newdata$x1v,yn$fit-qnorm(.975)*yn$se.fit,lty=3)

[Package bigsplines version 1.1-1 Index]