predict.gsm {npreg} | R Documentation |
Predict method for Generalized Smooth Model Fits
Description
predict method for class "gsm".
Usage
## S3 method for class 'gsm'
predict(object, newdata = NULL, se.fit = FALSE,
        type = c("link", "response", "terms"),
        terms = NULL, na.action = na.pass,
        intercept = NULL, combine = TRUE, design = FALSE,
        check.newdata = TRUE, ...)
Arguments
object |
a fit from |
newdata |
an optional list or data frame in which to look for variables with which to predict. If omitted, the original data are used. |
se.fit |
a switch indicating if standard errors are required. |
type |
type of prediction (link, response, or model term). Can be abbreviated. |
terms |
which terms to include in the fit. The default of |
na.action |
function determining what should be done with missing values in |
intercept |
a switch indicating if the intercept should be included in the prediction. If |
combine |
a switch indicating if the parametric and smooth components of the prediction should be combined (default) or returned separately. |
design |
a switch indicating if the model (design) matrix for the prediction should be returned. |
check.newdata |
a switch indicating if the |
... |
additional arguments affecting the prediction produced (currently ignored). |
Details
Inspired by the predict.glm function in R's stats package.
Produces predicted values, obtained by evaluating the regression function in the frame newdata (which defaults to model.frame(object)). If the logical se.fit is TRUE, standard errors of the predictions are calculated.
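As a minimal sketch of this behavior, the following fits a small binomial model and requests standard errors on a new grid. The simulated data, the object names (fit, newdata, pr), and the 1.96 normal-approximation multiplier are illustrative assumptions, not part of the package documentation.

```r
library(npreg)

# simulate a small binary-response data set (illustrative only)
set.seed(1)
n <- 200
x <- seq(0, 1, length.out = n)
y <- rbinom(n = n, size = 1, prob = plogis(3 * x - 1))

# fit a generalized smooth model using the default knot selection
fit <- gsm(y ~ x, family = binomial)

# predict on a new grid, requesting standard errors (link scale by default)
newdata <- data.frame(x = seq(0, 1, length.out = 25))
pr <- predict(fit, newdata = newdata, se.fit = TRUE)

# approximate 95% pointwise limits, mapped through the inverse logit
lower <- plogis(pr$fit - 1.96 * pr$se.fit)
upper <- plogis(pr$fit + 1.96 * pr$se.fit)
```

The limits are formed on the link scale and then transformed, so they stay inside the unit interval.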
If newdata is omitted, the predictions are based on the data used for the fit. Regardless of the newdata argument, how cases with missing values are handled is determined by the na.action argument. If na.action = na.omit, omitted cases will not appear in the predictions, whereas if na.action = na.exclude, they will appear (in predictions and standard errors) with value NA.
Similar to the glm function, setting type = "terms" returns a matrix giving the predictions for each of the requested model terms. Unlike the glm function, this function allows for predictions using any subset of the model terms. Specifically, the predictions (on both the link and response scale) will only include the requested terms, which makes it possible to obtain estimates (and standard errors) for subsets of model terms. In this case, the newdata only needs to contain data for the subset of variables that are requested in terms.
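The subset-of-terms behavior described above might be used as follows. This is a sketch under stated assumptions: the simulated data are illustrative, and the term label "x" is assumed to match the main-effect label the fitted model assigns to x.

```r
library(npreg)

# simulate binary responses with a smooth effect of x and a factor z
set.seed(1)
n <- 500
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
eta <- c(-2, 0, 2)[as.integer(z)] + 3 * x
y <- rbinom(n = n, size = 1, prob = plogis(eta))

# marginal knots, specified in the same way as in the Examples section
knots <- list(x = quantile(x, probs = seq(0, 0.9, by = 0.1)),
              z = letters[1:3])
fit <- gsm(y ~ x * z, family = binomial, knots = knots)

# only x is supplied because only the "x" term is requested (assumed label)
newdata <- data.frame(x = seq(0, 1, length.out = 50))
eff <- predict(fit, newdata = newdata, terms = "x", se.fit = TRUE)
str(eff)
```

Because z is not among the requested terms, newdata does not need a z column here.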
Value
Default use returns a vector of predictions. Otherwise the form of the output will depend on the combination of arguments: se.fit, type, combine, and design.
type = "link":
When se.fit = FALSE and design = FALSE, the output will be the predictions on the link scale. When se.fit = TRUE or design = TRUE, the output is a list with components fit, se.fit (if requested), and X (if requested).
type = "response":
When se.fit = FALSE and design = FALSE, the output will be the predictions on the data scale. When se.fit = TRUE or design = TRUE, the output is a list with components fit, se.fit (if requested), and X (if requested).
type = "terms":
When se.fit = FALSE and design = FALSE, the output will be the predictions for each term on the link scale. When se.fit = TRUE or design = TRUE, the output is a list with components fit, se.fit (if requested), and X (if requested).
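To illustrate how the link and response scales relate, the sketch below compares the two prediction types for a binomial fit. It assumes the default logit link of the binomial family, so the inverse link is plogis; the simulated data and object names are illustrative.

```r
library(npreg)

# simulate a small binary-response data set (illustrative only)
set.seed(1)
n <- 200
x <- seq(0, 1, length.out = n)
y <- rbinom(n = n, size = 1, prob = plogis(3 * x - 1))
fit <- gsm(y ~ x, family = binomial)

# predictions on the link (log-odds) and response (probability) scales
eta <- predict(fit, type = "link")
mu <- predict(fit, type = "response")

# with a logit link, the response prediction is the inverse logit of the link
same <- all.equal(as.numeric(plogis(eta)), as.numeric(mu))
```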
Regardless of the type, setting combine = FALSE decomposes the requested result(s) into the parametric and smooth contributions.
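A short sketch of requesting the decomposed result follows. The exact layout of the returned object is not described on this page, so the example only inspects it with str() rather than assuming component names; the data and object names are illustrative.

```r
library(npreg)

# simulate a small binary-response data set (illustrative only)
set.seed(1)
n <- 200
x <- seq(0, 1, length.out = n)
y <- rbinom(n = n, size = 1, prob = plogis(3 * x - 1))
fit <- gsm(y ~ x, family = binomial)

# request the parametric and smooth contributions separately
parts <- predict(fit, combine = FALSE)

# inspect the structure instead of assuming how the parts are named
str(parts)
```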
Author(s)
Nathaniel E. Helwig <helwig@umn.edu>
References
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/predict.glm.html
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. doi:10.1007/BF01404567
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. doi:10.1007/978-1-4614-5369-7
Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. doi:10.4135/9781526421036885885
See Also
Examples
# generate data
set.seed(1)
n <- 1000
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
fun <- function(x, z){
  # group-specific shifts for the three levels of z
  mu <- c(-2, 0, 2)
  zi <- as.integer(z)
  mu[zi] + 3 * x + sin(2 * pi * x + mu[zi]*pi/4)
}
fx <- fun(x, z)
y <- rbinom(n = n, size = 1, prob = 1 / (1 + exp(-fx)))
# define marginal knots
probs <- seq(0, 0.9, by = 0.1)
knots <- list(x = quantile(x, probs = probs),
              z = letters[1:3])
# fit gsm with specified knots (tprk = TRUE)
gsm.ssa <- gsm(y ~ x * z, family = binomial, knots = knots)
pred <- predict(gsm.ssa)
term <- predict(gsm.ssa, type = "terms")
# both differences should be approximately zero
mean((gsm.ssa$linear.predictors - pred)^2)
mean((gsm.ssa$linear.predictors - rowSums(term) - attr(term, "constant"))^2)
# fit gsm with specified knots (tprk = FALSE)
gsm.gam <- gsm(y ~ x * z, family = binomial, knots = knots, tprk = FALSE)
pred <- predict(gsm.gam)
term <- predict(gsm.gam, type = "terms")
# both differences should be approximately zero
mean((gsm.gam$linear.predictors - pred)^2)
mean((gsm.gam$linear.predictors - rowSums(term) - attr(term, "constant"))^2)