NGeDSgam {GeDS} | R Documentation |
NGeDSgam: Local Scoring Algorithm with GeD Splines in Backfitting
Description
Implements the Local Scoring Algorithm (Hastie and Tibshirani, 1986),
applying normal GeD splines (i.e., the NGeDS function) to fit the targets
within the backfitting iterations.
Usage
NGeDSgam(
  formula,
  family = "gaussian",
  data,
  weights = NULL,
  offset = NULL,
  normalize_data = FALSE,
  min_iterations,
  max_iterations,
  phi_gam_exit = 0.995,
  q_gam = 2,
  beta = 0.5,
  phi = 0.99,
  internal_knots = 500,
  q = 2,
  higher_order = TRUE
)
Arguments
formula |
a description of the structure of the model to be fitted,
including the dependent and independent variables. Unlike |
family |
a character string indicating the response variable distribution
and link function to be used. Default is |
data |
a data frame containing the variables referenced in the formula. |
weights |
an optional vector of ‘prior weights’ to be put on the
observations during the fitting process. It should be |
offset |
a vector of size |
normalize_data |
a logical that defines whether the data should be
normalized (standardized) before fitting the baseline linear model, i.e.,
before running the local-scoring algorithm. Normalizing the data involves
scaling the predictor variables to have a mean of 0 and a standard deviation
of 1. This process alters the scale and interpretation of the knots and
coefficients estimated. Default is equal to |
min_iterations |
optional parameter to manually set a minimum number of local-scoring iterations to be run. If not specified, it defaults to 0L. |
max_iterations |
optional parameter to manually set the maximum number
of local-scoring iterations to be run. If not specified, it defaults to 100L.
This setting serves as a fallback when the stopping rule, based on
consecutive deviances and tuned by |
phi_gam_exit |
Convergence threshold for local-scoring and backfitting.
Both algorithms stop when the relative change in the deviance is below this
threshold. Default is |
q_gam |
numeric parameter that fine-tunes the stopping rule of
local-scoring/backfitting, by default equal to |
beta |
numeric parameter in the interval |
phi |
numeric parameter in the interval |
internal_knots |
The maximum number of internal knots that can be added
by the GeDS smoothers in each backfitting iteration, effectively setting the
value of |
q |
numeric parameter that fine-tunes the stopping rule of
stage A of GeDS, by default equal to |
higher_order |
a logical that defines whether to compute the higher order
fits (quadratic and cubic) after the local-scoring algorithm is run. Default
is |
Details
The NGeDSgam function employs the local scoring algorithm to fit a
Generalized Additive Model (GAM). This algorithm iteratively fits weighted
additive models by backfitting. Normal linear GeD splines, as well as linear
learners, are supported as function smoothers within the backfitting
algorithm. The local-scoring algorithm ultimately produces a linear fit.
Higher order fits (quadratic and cubic) are then computed by calculating
Schoenberg's variation diminishing spline (VDS) approximation of the linear
fit.
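The VDS step described above can be illustrated with a small base R sketch.
It is not the GeDS implementation: vds() is a hypothetical helper, the
piecewise linear function lin_fit merely stands in for a fitted linear
spline, and the interior knots are reused as they are, whereas GeDS chooses
the knots of the higher order fits by its own rule. Given a knot vector, the
VDS of order ord uses the values of the function at the Greville sites
(averages of ord - 1 consecutive knots) as B-spline coefficients.

```r
library(splines)

# Hypothetical helper: Schoenberg's VDS approximation of a function f
# on [a, b], for a spline of order `ord` with the given interior knots.
vds <- function(f, interior, a, b, ord) {
  knots <- c(rep(a, ord), interior, rep(b, ord))
  p <- length(knots) - ord  # number of B-spline basis functions
  # Greville sites: averages of ord - 1 consecutive knots
  greville <- vapply(seq_len(p),
                     function(i) mean(knots[(i + 1):(i + ord - 1)]),
                     numeric(1))
  coefs <- f(greville)      # VDS coefficients are values of f at the sites
  function(x) drop(splineDesign(knots, x, ord = ord) %*% coefs)
}

# Stand-in for a linear spline fit with kinks at 0.3 and 0.7 (made-up values)
lin_fit <- approxfun(c(0, 0.3, 0.7, 1), c(0, 1, 0.5, 2))

quad  <- vds(lin_fit, interior = c(0.3, 0.7), a = 0, b = 1, ord = 3)
cubic <- vds(lin_fit, interior = c(0.3, 0.7), a = 0, b = 1, ord = 4)

x <- seq(0.01, 0.99, length.out = 50)
# Largest gap between the quadratic approximation and the linear "fit"
max(abs(quad(x) - lin_fit(x)))
```

Because the coefficients are values of the approximated function and the
B-splines are non-negative and sum to one, the approximation stays within the
range of the linear fit and reproduces straight lines exactly.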
On the one hand, NGeDSgam includes all the parameters of NGeDS, which in
this case tune the smoother fit at each backfitting iteration. On the other
hand, NGeDSgam includes some additional parameters proper to the
local-scoring procedure. We describe the main ones as follows.
The family chosen determines the link function, adjusted dependent
variable and weights to be used in the local-scoring algorithm. The number of
local-scoring and backfitting iterations is controlled by a
Ratio of Deviances stopping rule similar to the one presented for GGeDS.
In the same way that phi and q tune the stopping rule of GGeDS,
phi_gam_exit and q_gam tune the stopping rule of NGeDSgam. The user can also
manually control the number of local-scoring iterations through
min_iterations and max_iterations.
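As a rough illustration of how the family drives local scoring, the base R
sketch below computes the adjusted dependent variable and the weights from a
family object and alternates them with an inner backfitting loop. It rests on
several assumptions that are not part of GeDS: the data are simulated, the
hypothetical smooth_linear() uses a weighted straight-line fit in place of
the GeD spline smoother, and the loops stop on a plain relative change in
deviance (tol) rather than on the Ratio of Deviances rule tuned by
phi_gam_exit and q_gam.

```r
set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- rpois(n, lambda = exp(0.5 + 1.2 * x1 - 0.8 * x2))  # simulated data

fam <- poisson(link = "log")
tol <- 1e-8  # hypothetical tolerance, not a GeDS parameter

# Hypothetical stand-in smoother: weighted straight-line fit, centred
smooth_linear <- function(x, r, w) {
  fit <- fitted(lm(r ~ x, weights = w))
  fit - weighted.mean(fit, w)
}

eta <- rep(log(mean(y)), n)  # initial additive predictor
f1 <- f2 <- rep(0, n)
dev_old <- sum(fam$dev.resids(y, fam$linkinv(eta), rep(1, n)))

for (iter in 1:50) {                        # local-scoring iterations
  mu     <- fam$linkinv(eta)
  gprime <- 1 / fam$mu.eta(eta)             # derivative of the link at mu
  z <- eta + (y - mu) * gprime              # adjusted dependent variable
  w <- 1 / (gprime^2 * fam$variance(mu))    # weights

  alpha <- weighted.mean(z, w)
  for (inner in 1:20) {                     # backfitting iterations
    f1_new <- smooth_linear(x1, z - alpha - f2, w)
    f2_new <- smooth_linear(x2, z - alpha - f1_new, w)
    delta  <- max(abs(f1_new - f1), abs(f2_new - f2))
    f1 <- f1_new
    f2 <- f2_new
    if (delta < tol) break
  }

  eta <- alpha + f1 + f2
  dev_new <- sum(fam$dev.resids(y, fam$linkinv(eta), rep(1, n)))
  if (abs(dev_old - dev_new) / dev_old < tol) break
  dev_old <- dev_new
}

# With straight-line smoothers the result should agree with a Poisson GLM
c(local_scoring = dev_new,
  glm = deviance(glm(y ~ x1 + x2, family = poisson)))
```

Replacing smooth_linear() with a more flexible smoother leaves the outer loop
unchanged, which is the role the GeD spline smoother plays in NGeDSgam.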
Value
A GeDSgam-Class object, i.e., a list of items that summarizes the main
details of the fitted GAM-GeDS model. See GeDSgam-Class for details. Some S3
methods are available to make these objects tractable, such as coef, knots,
print and predict.
References
Hastie, T. and Tibshirani, R. (1986). Generalized Additive Models.
Statistical Science, 1(3), 297–310.
DOI: 10.1214/ss/1177013604
Kaishev, V. K., Dimitrova, D. S., Haberman, S. and Verrall, R. J. (2016).
Geometrically designed, variable knot regression splines.
Computational Statistics, 31, 1079–1105.
DOI: 10.1007/s00180-015-0621-7
Dimitrova, D. S., Kaishev, V. K., Lattuada, A. and Verrall, R. J. (2023).
Geometrically designed variable knot splines in generalized (non-)linear
models.
Applied Mathematics and Computation, 436.
DOI: 10.1016/j.amc.2022.127493
Dimitrova, D. S., Guillen, E. S. and Kaishev, V. K. (2024). GeDS: An R Package for Regression, Generalized Additive Models and Functional Gradient Boosting, based on Geometrically Designed (GeD) Splines. Manuscript submitted for publication.
See Also
NGeDS; GGeDS; GeDSgam-Class;
S3 methods such as knots.GeDSgam; coef.GeDSgam;
deviance.GeDSgam; predict.GeDSgam
Examples
# Load package
library(GeDS)
data(airquality)
# Drop incomplete rows and apply a cube-root transform to the response
data <- na.omit(airquality)
data$Ozone <- data$Ozone^(1/3)
formula <- Ozone ~ f(Solar.R) + f(Wind, Temp)
Gmodgam <- NGeDSgam(formula = formula, data = data,
                    phi_gam_exit = 0.995, phi = 0.995, q = 2)
# In-sample mean squared error of the linear, quadratic and cubic fits
MSE_Gmodgam_linear <- mean((data$Ozone - Gmodgam$predictions$pred_linear)^2)
MSE_Gmodgam_quadratic <- mean((data$Ozone - Gmodgam$predictions$pred_quadratic)^2)
MSE_Gmodgam_cubic <- mean((data$Ozone - Gmodgam$predictions$pred_cubic)^2)
cat("\n", "MEAN SQUARED ERROR", "\n",
    "Linear NGeDSgam:", MSE_Gmodgam_linear, "\n",
    "Quadratic NGeDSgam:", MSE_Gmodgam_quadratic, "\n",
    "Cubic NGeDSgam:", MSE_Gmodgam_cubic, "\n")
## S3 methods for class 'GeDSgam'
# Print
print(Gmodgam)
# Knots
knots(Gmodgam, n = 2L)
knots(Gmodgam, n = 3L)
knots(Gmodgam, n = 4L)
# Coefficients
coef(Gmodgam, n = 2L)
coef(Gmodgam, n = 3L)
coef(Gmodgam, n = 4L)
# Deviances
deviance(Gmodgam, n = 2L)
deviance(Gmodgam, n = 3L)
deviance(Gmodgam, n = 4L)