hglm {holiglm}    R Documentation

Fitting Holistic Generalized Linear Models

Description

Fit a generalized linear model under holistic constraints.

Usage

hglm(
  formula,
  family = gaussian(),
  data,
  constraints = NULL,
  weights = NULL,
  scaler = c("auto", "center_standardization", "center_minmax", "standardization",
    "minmax", "off"),
  scale_response = NULL,
  big_m = 100,
  solver = "auto",
  control = list(),
  dry_run = FALSE,
  object_size = c("normal", "big")
)

holiglm(
  formula,
  family = gaussian(),
  data,
  constraints = NULL,
  weights = NULL,
  scaler = c("auto", "center_standardization", "center_minmax", "standardization",
    "minmax", "off"),
  scale_response = NULL,
  big_m = 100,
  solver = "auto",
  control = list(),
  dry_run = FALSE,
  object_size = c("normal", "big")
)

hglm_seq(
  k_seq,
  formula,
  family = gaussian(),
  data,
  constraints = NULL,
  weights = NULL,
  scaler = c("auto", "center_standardization", "center_minmax", "standardization",
    "minmax", "off"),
  big_m = 100,
  solver = "auto",
  control = list(),
  object_size = c("normal", "big"),
  parallel = FALSE
)

Arguments

formula

an object of class "formula" giving the symbolic description of the model to be fitted.

family

a description of the error distribution and link function to be used in the model.

data

a data.frame or matrix giving the data for the estimation.

constraints

a list of 'HGLM' constraints of class "lohglmc". Use NULL to turn off constraints.

weights

an optional vector of 'prior weights' to be used for the estimation.

scaler

a character string giving the name of the scaling function (default is "auto") to be employed for the covariates. This typically does not need to be changed.

scale_response

a logical indicating whether the response should be standardized. Can only be used with family gaussian(). The default is TRUE for family gaussian() and FALSE for all other families.

big_m

an upper bound for the coefficients, needed for the big-M constraints. Currently, constraints created by group_sparsity(), group_inout(), include() and group_equal() use the big-M value specified here.

solver

a character string giving the name of the solver to be used for the estimation.

control

a list of control parameters passed to ROI_solve.

dry_run

a logical; if TRUE the model is not fit but only constructed.

object_size

a character string giving the object size; allowed values are "normal" and "big". If "big" is chosen, the ROI solution and the "hglm_model" object are also returned.

k_seq

an integer vector giving the values of k_max for which the model should be estimated.

parallel

a logical indicating whether the estimation of the sequence should be parallelized.
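As a rough illustration of how several of these arguments fit together, the sketch below reuses rhglm() and k_max() from the Examples section. Combining constraints with c() and the use of rho_max() are assumptions made for illustration and should be checked against the respective help pages; the value k_seq = 1:3 is arbitrary.

```r
library("holiglm")

# simulated data, as in the Examples section
dat <- rhglm(100, c(1, 2, -3, 4, 5, -6))

# several constraints in one list
# (assumption: c() concatenates constraint objects and rho_max() exists
#  with a correlation bound as its first argument)
constraints <- c(k_max(3), rho_max(0.8))
fit <- hglm(y ~ ., constraints = constraints, data = dat)

# construct the model without solving it
model <- hglm(y ~ ., constraints = k_max(3), data = dat, dry_run = TRUE)
class(model)

# one fit for each k_max value in k_seq
fits <- hglm_seq(k_seq = 1:3, formula = y ~ ., data = dat)
```

With dry_run = TRUE nothing is sent to the solver, which can help when checking that a set of constraints is accepted before starting a long-running estimation.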

Details

If linear constraints are binding, the standard errors are corrected. More information about the correction can be found in Schwendinger, Schwendinger and Vana (2024) doi:10.18637/jss.v108.i07.
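A minimal sketch of a fit with a binding linear constraint is given below. It assumes that the package's linear() constructor takes a named coefficient vector, a direction and a right-hand side, and that the covariates simulated by rhglm() are named x1, x2, and so on; neither is stated on this page, so both should be verified in the corresponding help pages.

```r
library("holiglm")

dat <- rhglm(100, c(1, 2, -3, 4, 5, -6))

# equality constraint x1 + x2 == 0; an equality is always binding
# (assumption: linear(L, dir, rhs) with a named vector L, and covariate
#  names x1 and x2 in the simulated data)
constraints <- linear(c(x1 = 1, x2 = 1), "==", 0)
fit <- hglm(y ~ ., constraints = constraints, data = dat)

# the reported standard errors account for the binding constraint
summary(fit)
```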

Value

An object of class "hglm" inheriting from "glm".
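Because the returned object inherits from "glm", generic accessors such as coef() can be applied to it. The following is a small sketch under that assumption, using only functions that appear in the Examples section plus base R generics.

```r
library("holiglm")

dat <- rhglm(100, c(1, 2, -3, 4, 5, -6))
fit <- hglm(y ~ ., constraints = k_max(3), data = dat)

# class check and coefficient extraction
inherits(fit, "glm")
coef(fit)
```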

References

Schwendinger, B., Schwendinger, F., & Vana, L. (2024). Holistic Generalized Linear Models. Journal of Statistical Software, 108(7). doi:10.18637/jss.v108.i07

Bertsimas, D., & King, A. (2016). OR Forum: An Algorithmic Approach to Linear Regression. Operations Research, 64(1): 2–16. doi:10.1287/opre.2015.1436

McCullagh, P., & Nelder, J. A. (2019). Generalized Linear Models (2nd ed.). Routledge. doi:10.1201/9780203753736

Dobson, A. J., & Barnett, A. G. (2018). An Introduction to Generalized Linear Models (4th ed.). Chapman and Hall/CRC. doi:10.1201/9781315182780

Chares, Robert. (2009). “Cones and Interior-Point Algorithms for Structured Convex Optimization involving Powers and Exponentials.”

Chen, J., & Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika, 95 (3): 759–771. Oxford University Press. doi:10.1093/biomet/asn034

Zhu, J., Wen, C., Zhu, J., Zhang, H., & Wang, X. (2020). A polynomial algorithm for best-subset selection problem. Proceedings of the National Academy of Sciences, 117 (52): 33117–33123. doi:10.1073/pnas.2014241117

Examples

library("holiglm")

# simulate 100 observations from a GLM with the given coefficient vector
dat <- rhglm(100, c(1, 2, -3, 4, 5, -6))
# estimation without constraints
hglm(y ~ ., constraints = NULL, data = dat)
# estimation with an upper bound on the number of coefficients to be selected
hglm(y ~ ., constraints = k_max(3), data = dat)
# estimation without intercept
hglm(y ~ . - 1, data = dat)


[Package holiglm version 1.0.0 Index]