R: PGEE accelerated with Rcpp

PGEE {geeVerse}

R Documentation

PGEE accelerated with Rcpp

Description

A function to fit a penalized generalized estimating equation (PGEE) model. The function has been partly rewritten with Rcpp and RcppEigen for better computational efficiency.

Usage

PGEE(
  formula,
  id,
  data,
  na.action = NULL,
  family = gaussian(link = "identity"),
  corstr = "independence",
  Mv = NULL,
  beta_int = NULL,
  R = NULL,
  scale.fix = TRUE,
  scale.value = 1,
  lambda,
  pindex = NULL,
  eps = 10^-6,
  maxiter = 30,
  tol = 10^-3,
  silent = TRUE
)

Arguments

`formula`	A formula expression of the form `response ~ predictors`.
`id`	A vector for identifying subjects/clusters.
`data`	A data frame containing the variables in `formula` along with the `id` variable.
`na.action`	A function to remove missing values from the data. Only `na.omit` is allowed here.
`family`	A `family` object: a list of functions and expressions defining the `link` and `variance` functions. Families supported in `PGEE` are `binomial`, `gaussian`, `gamma` and `poisson`. Links that are not available in `gee` are not available here either. The default family is `gaussian`.
`corstr`	A character string specifying the type of correlation structure. Structures supported in `PGEE` are `"AR-1"`, `"exchangeable"`, `"fixed"`, `"independence"`, `"stat_M_dep"`, `"non_stat_M_dep"`, and `"unstructured"`. The default `corstr` is `"independence"`.
`Mv`	If either `"stat_M_dep"` or `"non_stat_M_dep"` is specified in `corstr`, then `Mv` should be assigned a numeric value. Otherwise, the default value is `NULL`.
`beta_int`	User specified initial values for regression parameters. The default value is `NULL`.
`R`	If `corstr = "fixed"` is specified, then `R` is a square matrix, with dimension equal to the maximum cluster size, containing the user-specified correlation. Otherwise, the default value is `NULL`.
`scale.fix`	A logical variable; if true, the scale parameter is fixed at the value of `scale.value`. The default value is `TRUE`.
`scale.value`	If `scale.fix = TRUE`, this assigns a numeric value to which the scale parameter should be fixed. The default value is 1.
`lambda`	A numeric value for the tuning parameter of the SCAD penalty function, typically chosen via cross-validation.
`pindex`	An index vector identifying the parameters that are not subject to penalization. The default value is `NULL`. However, for a model with an intercept, the intercept parameter should never be penalized.
`eps`	A numeric value for the epsilon used in the minorization-maximization algorithm. The default value is `10^-6`.
`maxiter`	The maximum number of iterations used in the estimation algorithm. The default value is `30`.
`tol`	The tolerance level that is used in the estimation algorithm. The default value is `10^-3`.
`silent`	A logical variable; if false, the regression parameter estimates at each iteration are printed. The default value is `TRUE`.
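
The roles of `lambda` and `eps` can be illustrated with a minimal base-R sketch of the SCAD penalty derivative and the weights it produces in a minorization-maximization step. The helper name `scad_derivative`, the constant `a = 3.7` (the value conventionally used with SCAD), and the weight form `q(|beta|) / (eps + |beta|)` are assumptions taken from the general penalized GEE literature, not from the geeVerse source; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of geeVerse): first derivative of the SCAD
# penalty evaluated at |theta|, with the conventional constant a = 3.7.
scad_derivative <- function(theta, lambda, a = 3.7) {
  theta <- abs(theta)
  # Constant slope lambda up to lambda, then a linear decay to zero at a * lambda.
  lambda * (theta <= lambda) +
    pmax(a * lambda - theta, 0) / (a - 1) * (theta > lambda)
}

# Illustrative coefficient vector: one large, two small, one exactly zero.
beta <- c(1.2, 0.02, 0.005, 0)
lambda <- 0.01
eps <- 10^-6

# Assumed form of the MM penalty weights: eps keeps the ratio finite at zero.
mm_weights <- scad_derivative(beta, lambda) / (eps + abs(beta))
print(mm_weights)
```

Under this form, large coefficients receive a weight of zero and are left essentially unpenalized, while coefficients near zero receive very large weights and are shrunk strongly, which is why `eps` must be small but positive.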

Value

A PGEE object, which includes: fitted coefficients - the fitted single index coefficients with unit norm and first component being non-negative.

Examples

# Generate data
set.seed(2021)
sim_data <- generateData(nsub = 100, nobs = rep(10, 100), p = 100,
                         c(rep(1, 7), rep(0, 93)), rho = 0.6, correlation = "AR1",
                         dis = "normal", ka = 1)

X <- sim_data$X
y <- sim_data$y
id <- rep(1:100, each = 10)
data <- data.frame(X, y, id)

# Fit the penalized GEE with an exchangeable working correlation
PGEE_fit <- PGEE("y ~.-id-1", id = id, data = data, corstr = "exchangeable", lambda = 0.01)
PGEE_fit$coefficients
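
For `corstr = "fixed"`, the `R` argument expects a square correlation matrix whose dimension equals the maximum cluster size. The following base-R sketch builds one such matrix with an AR(1) pattern. The names `m`, `rho`, and `R_fixed` are illustrative, and the commented-out call shows only how the matrix might be passed under the documented arguments; it has not been run against the package.

```r
# Maximum cluster size (10 observations per subject in the example above).
m <- 10
# Assumed within-cluster correlation for this illustration.
rho <- 0.6

# AR(1) correlation matrix: entry (i, j) equals rho^|i - j|.
R_fixed <- rho^abs(outer(seq_len(m), seq_len(m), "-"))

# Basic sanity checks for a correlation matrix.
stopifnot(isSymmetric(R_fixed), all(diag(R_fixed) == 1))

# Possible usage (assumption: requires geeVerse and the `data` and `id`
# objects created in the example above):
# PGEE("y ~.-id-1", id = id, data = data, corstr = "fixed",
#      R = R_fixed, lambda = 0.01)
```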

[Package geeVerse version 0.2.1 Index]