UnivariateFitters {GeDS}

R Documentation

Functions used to fit GeDS objects with a univariate spline regression component

Description

These are computing engines called by NGeDS and GGeDS, needed for the underlying fitting procedures.

Usage

UnivariateFitter(
  X,
  Y,
  Z = NULL,
  offset = rep(0, NROW(Y)),
  weights = rep(1, length(X)),
  beta = 0.5,
  phi = 0.5,
  min.intknots = 0,
  max.intknots = 300,
  q = 2,
  extr = range(X),
  show.iters = FALSE,
  tol = as.double(1e-12),
  stoptype = c("SR", "RD", "LR"),
  higher_order = TRUE,
  intknots = NULL,
  only_predictions = FALSE
)

GenUnivariateFitter(
  X,
  Y,
  Z = NULL,
  offset = rep(0, NROW(Y)),
  weights = rep(1, length(X)),
  family = gaussian(),
  beta = 0.5,
  phi = 0.5,
  min.intknots = 0,
  max.intknots = 300,
  q = 2,
  extr = range(X),
  show.iters = FALSE,
  tol = as.double(1e-12),
  stoptype = c("SR", "RD", "LR"),
  higher_order = TRUE
)

Arguments

`X`	a numeric vector containing `N` sample values of the covariate chosen to enter the spline regression component of the predictor model.
`Y`	a vector of size `N` containing the observed values of the response variable `y`.
`Z`	a design matrix with `N` rows containing other covariates selected to enter the parametric component of the predictor model (see `formula`). If no such covariates are selected, it is set to `NULL` by default.
`offset`	a vector of size `N` that can be used to specify a fixed covariate to be included in the predictor model avoiding the estimation of its corresponding regression coefficient. In case more than one covariate is fixed, the user should sum the corresponding coordinates of the fixed covariates to produce one common `N`-vector of coordinates. The `offset` argument is particularly useful when using `GenUnivariateFitter` if the link function used is not the identity.
`weights`	an optional vector of size `N` of ‘prior weights’ to be put on the observations in the fitting process in case the user requires weighted GeDS fitting. By default, all weights are equal to 1 (`rep(1, length(X))`).
`beta`	numeric parameter in the interval `[0,1]` tuning the knot placement in stage A of GeDS. See the description of `NGeDS` or `GGeDS`.
`phi`	numeric parameter in the interval `[0,1]` specifying the threshold for the stopping rule (model selector) in stage A of GeDS. See also `stoptype` and details in the description of `NGeDS` or `GGeDS`.
`min.intknots`	optional parameter allowing the user to set a minimum number of internal knots required. By default equal to zero.
`max.intknots`	optional parameter allowing the user to set a maximum number of internal knots to be added by the GeDS estimation algorithm. In these fitter functions the default is 300, as shown in Usage; for reference, the saturated GeDS model has `\kappa=N-2` internal knots.
`q`	numeric parameter that allows fine-tuning of the stopping rule of stage A of GeDS, by default equal to 2. See details in the description of `NGeDS` or `GGeDS`.
`extr`	numeric vector of 2 elements representing the left-most and right-most limits of the interval embedding the sample values of `X`. By default equal to the smallest and largest values of `X`, respectively.
`show.iters`	logical variable indicating whether or not to print information at each step. By default equal to `FALSE`.
`tol`	numeric value indicating the tolerance to be used in the knot placement steps in stage A. By default equal to 1E-12. See details below.
`stoptype`	a character string indicating the type of GeDS stopping rule to be used. It should be either `"SR"`, `"RD"` or `"LR"`, partial match allowed. See details of `NGeDS` or `GGeDS`.
`higher_order`	a logical that defines whether to compute the higher order fits (quadratic and cubic) after stage A is run. Default is `TRUE`.
`intknots`	vector of initial internal knots from which to start the GeDS Stage A iterations. See Section 3 of Kaishev et al. (2016). Default is `NULL`.
`only_predictions`	logical, if `TRUE` only predictions are computed.
`family`	a description of the error distribution and link function to be used in the model. This can be a character string naming a family function (e.g. `"gaussian"`), the family function itself (e.g. `gaussian`) or the result of a call to a family function (e.g. `gaussian()`). See family for details on family functions.
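
As an illustration of the `offset` argument described above, the following sketch combines two fixed covariates into a single `N`-vector. The variable names `exposure` and `known_effect` are hypothetical, and the log-exposure term is an assumption corresponding to a Poisson model with a log link.

```r
set.seed(1)
N <- 100
exposure     <- runif(N, min = 0.5, max = 2)  # hypothetical exposure per observation
known_effect <- rnorm(N, sd = 0.1)            # hypothetical second fixed covariate

# One common offset on the predictor scale: sum the fixed terms coordinate-wise.
offset_total <- log(exposure) + known_effect
length(offset_total) == N

# offset_total could then be supplied to the fitter, e.g.
# GenUnivariateFitter(X, Y, offset = offset_total, family = poisson())
```

With a log link the offset enters on the log scale, which is why the exposure is log-transformed before summing.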

Details

The functions UnivariateFitter and GenUnivariateFitter are in general not intended to be used directly; they should be called through NGeDS and GGeDS. However, when many GeDS fits are needed (e.g. in Monte Carlo simulations), it may be more efficient to call the fitters directly, outside the main functions.
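
A Monte Carlo use of the fitter might look like the sketch below. It assumes the GeDS package is installed, reuses the data-generating setup from the Examples section, and takes `n_rep` as an arbitrary illustrative number of replications; it is not a recommended simulation design.

```r
# Sketch: repeated GeDS fits on simulated data, calling the fitter directly.
n_rep <- 5  # illustrative number of replications (assumption)

if (requireNamespace("GeDS", quietly = TRUE)) {
  set.seed(123)
  N <- 500
  f_1 <- function(x) (10 * x / (1 + 100 * x^2)) * 4 + 4
  X <- sort(runif(N, min = -2, max = 2))  # design kept fixed across replications

  fits <- lapply(seq_len(n_rep), function(i) {
    Y <- rnorm(N, mean = f_1(X), sd = 0.1)  # new noisy response in each replication
    GeDS::UnivariateFitter(X, Y, beta = 0.6, phi = 0.995, extr = c(-2, 2))
  })
  length(fits)  # one fitted object per replication
}
```

Each element of `fits` is the object described under Value, so any post-processing has to work without the Formula, extcall, terms and znames slots.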

The argument tol is used in the knot placement procedure of stage A of the GeDS algorithm to check whether the current knot \delta^* is set at an acceptable location. If there exists a knot \delta_i such that |\delta^* - \delta_i| < tol, then the new knot \delta^* is considered to be coalescent with an existing one; it is discarded and the algorithm seeks alternative knot locations. By default, tol is equal to 1e-12.
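
The coalescence check described above can be sketched in base R as follows. This is only an illustration: `is_coalescent` is a hypothetical helper, not a function exported by GeDS, and the package's internal implementation may differ.

```r
# Hypothetical helper (not part of GeDS): TRUE if the candidate knot lies
# within `tol` of at least one existing knot.
is_coalescent <- function(delta_star, knots, tol = 1e-12) {
  any(abs(delta_star - knots) < tol)
}

knots <- c(-2, -0.5, 0.3, 2)
is_coalescent(0.3 + 1e-14, knots)  # candidate practically coincides with the knot at 0.3
is_coalescent(0.31, knots)         # candidate is clearly distinct from all knots
```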

See NGeDS and GGeDS, Kaishev et al. (2016) and Dimitrova et al. (2023) for further details.

Value

A GeDS-Class object, but without the Formula, extcall, terms and znames slots.

References

Kaishev, V. K., Dimitrova, D. S., Haberman, S. and Verrall, R. J. (2016). Geometrically designed, variable knot regression splines. Computational Statistics, 31, 1079–1105.
DOI: 10.1007/s00180-015-0621-7

Dimitrova, D. S., Kaishev, V. K., Lattuada, A. and Verrall, R. J. (2023). Geometrically designed variable knot splines in generalized (non-)linear models. Applied Mathematics and Computation, 436.
DOI: 10.1016/j.amc.2022.127493

Examples

# Examples similar to the ones
# presented in NGeDS and in GGeDS

# Generate a data sample for the response variable
# Y and the covariate X
set.seed(123)
N <- 500
f_1 <- function(x) (10*x/(1+100*x^2))*4+4
X <- sort(runif(N, min = -2, max = 2))
# Specify a model for the mean of Y to include only
# a component non-linear in X, defined by the function f_1
means <- f_1(X)
# Add (Normal) noise to the mean of Y
Y <- rnorm(N, means, sd = 0.1)

# Fit a Normal GeDS regression model using the fitter function
(Gmod <- UnivariateFitter(X, Y, beta = 0.6, phi = 0.995,
           extr = c(-2,2)))

##############################################################
# second: very similar example, but based on Poisson data
set.seed(123)
X <- sort(runif(N, min = -2, max = 2))
means <- exp(f_1(X))
Y <- rpois(N,means)
(Gmod2 <- GenUnivariateFitter(X, Y, beta = 0.2,
            phi = 0.995, family = poisson(), extr = c(-2,2)))

# a plot showing quadratic and cubic fits,
# in the predictor scale
plot(X,log(Y), xlab = "x", ylab = expression(f[1](x)))
lines(Gmod2, n = 3, col = "red")
lines(Gmod2, n = 4, col = "blue", lty = 2)
legend("topleft", c("Quadratic","Cubic"),
     col = c("red","blue"), lty = c(1,2))

[Package GeDS version 0.2.3 Index]