R: Predict method for Generalized Smooth Model Fits

predict.gsm {npreg}

R Documentation

Predict method for Generalized Smooth Model Fits

Description

predict method for class "gsm".

Usage

## S3 method for class 'gsm'
predict(object, newdata = NULL, se.fit = FALSE, 
        type = c("link", "response", "terms"), 
        terms = NULL, na.action = na.pass,
        intercept = NULL, combine = TRUE, design = FALSE, 
        check.newdata = TRUE, ...)

Arguments

`object`	a fit from `gsm`.
`newdata`	an optional list or data frame in which to look for variables with which to predict. If omitted, the original data are used.
`se.fit`	a switch indicating if standard errors are required.
`type`	type of prediction (link, response, or model term). Can be abbreviated.
`terms`	which terms to include in the fit. The default of `NULL` uses all terms. This input is used regardless of the `type` of prediction.
`na.action`	function determining what should be done with missing values in `newdata`. The default is to predict `NA`.
`intercept`	a switch indicating if the intercept should be included in the prediction. If `NULL` (default), the intercept is included in the fit only when `type = "r"` and `terms` includes all model terms.
`combine`	a switch indicating if the parametric and smooth components of the prediction should be combined (default) or returned separately.
`design`	a switch indicating if the model (design) matrix for the prediction should be returned.
`check.newdata`	a switch indicating if the `newdata` should be checked for consistency (e.g., class and range). Ignored if `newdata` is not provided.
`...`	additional arguments affecting the prediction produced (currently ignored).

Details

Inspired by the predict.glm function in R's stats package.

Produces predicted values, obtained by evaluating the regression function in the frame newdata (which defaults to model.frame(object)). If the logical se.fit is TRUE, standard errors of the predictions are calculated.

If newdata is omitted the predictions are based on the data used for the fit. Regardless of the newdata argument, how cases with missing values are handled is determined by the na.action argument. If na.action = na.omit omitted cases will not appear in the predictions, whereas if na.action = na.exclude they will appear (in predictions and standard errors), with value NA.

Similar to the glm function, setting type = "terms" returns a matrix giving the predictions for each of the requested model terms. Unlike the glm function, this function allows for predictions using any subset of the model terms. Specifically, the predictions (on both the link and response scale) will only include the requested terms, which makes it possible to obtain estimates (and standard errors) for subsets of model terms. In this case, the newdata only needs to contain data for the subset of variables that are requested in terms.

Value

Default use returns a vector of predictions. Otherwise the form of the output will depend on the combination of argumments: se.fit, type, combine, and design.

type = "link":
When se.fit = FALSE and design = FALSE, the output will be the predictions on the link scale. When se.fit = TRUE or design = TRUE, the output is a list with components fit, se.fit (if requested), and X (if requested).

type = "response":
When se.fit = FALSE and design = FALSE, the output will be the predictions on the data scale. When se.fit = TRUE or design = TRUE, the output is a list with components fit, se.fit (if requested), and X (if requested).

type = "terms":
When se.fit = FALSE and design = FALSE, the output will be the predictions for each term on the link scale. When se.fit = TRUE or design = TRUE, the output is a list with components fit, se.fit (if requested), and X (if requested).

Regardless of the type, setting combine = FALSE decomposes the requested result(s) into the parametric and smooth contributions.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

https://stat.ethz.ch/R-manual/R-devel/library/stats/html/predict.glm.html

Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. doi:10.1007/BF01404567

Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. doi:10.1007/978-1-4614-5369-7

Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi:10.4135/9781526421036885885

Examples

# generate data
set.seed(1)
n <- 1000
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
fun <- function(x, z){
  mu <- c(-2, 0, 2)
  zi <- as.integer(z)
  fx <- mu[zi] + 3 * x + sin(2 * pi * x + mu[zi]*pi/4)
}
fx <- fun(x, z)
y <- rbinom(n = n, size = 1, p = 1 / (1 + exp(-fx)))

# define marginal knots
probs <- seq(0, 0.9, by = 0.1)
knots <- list(x = quantile(x, probs = probs),
              z = letters[1:3])

# fit gsm with specified knots (tprk = TRUE)
gsm.ssa <- gsm(y ~ x * z, family = binomial, knots = knots)
pred <- predict(gsm.ssa)
term <- predict(gsm.ssa, type = "terms")
mean((gsm.ssa$linear.predictors - pred)^2)
mean((gsm.ssa$linear.predictors - rowSums(term) - attr(term, "constant"))^2)

# fit gsm with specified knots (tprk = FALSE)
gsm.gam <- gsm(y ~ x * z, family = binomial, knots = knots, tprk = FALSE)
pred <- predict(gsm.gam)
term <- predict(gsm.gam, type = "terms")
mean((gsm.gam$linear.predictors - pred)^2)
mean((gsm.gam$linear.predictors - rowSums(term) - attr(term, "constant"))^2)

[Package npreg version 1.1.0 Index]