PGEE {geeVerse}R Documentation

PGEE accelerated with Rcpp

Description

A function to fit a penalized generalized estimating equation (PGEE) model. This function was partly rewritten with Rcpp and RcppEigen for better computational efficiency.

Usage

PGEE(
  formula,
  id,
  data,
  na.action = NULL,
  family = gaussian(link = "identity"),
  corstr = "independence",
  Mv = NULL,
  beta_int = NULL,
  R = NULL,
  scale.fix = TRUE,
  scale.value = 1,
  lambda,
  pindex = NULL,
  eps = 10^-6,
  maxiter = 30,
  tol = 10^-3,
  silent = TRUE
)

Arguments

formula

A formula expression of the form response ~ predictors.

id

A vector for identifying subjects/clusters.

data

A data frame that stores the variables in formula together with the id variable.

na.action

A function to remove missing values from the data. Only na.omit is allowed here.

family

A family object: a list of functions and expressions for defining link and variance functions. Families supported in PGEE are binomial, gaussian, gamma and poisson. Links that are not available in gee are not available here either. The default family is gaussian.

corstr

A character string that specifies the type of working correlation structure. Structures supported in PGEE are "AR-1", "exchangeable", "fixed", "independence", "stat_M_dep", "non_stat_M_dep", and "unstructured". The default corstr is "independence".
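
To make two of these structures concrete, the following is a minimal base-R sketch of the exchangeable and AR-1 correlation patterns for a single cluster. The cluster size n and the correlation value alpha are hypothetical values chosen for illustration; this is not the code PGEE uses internally.

```r
# Illustrative only: two common working correlation patterns for one cluster.
# `n` (cluster size) and `alpha` (correlation) are hypothetical values.
n <- 4
alpha <- 0.6

# Exchangeable: every pair of observations shares the same correlation alpha.
R_exch <- matrix(alpha, n, n)
diag(R_exch) <- 1

# AR-1: correlation decays as alpha^|j - k| with the lag between observations.
R_ar1 <- alpha^abs(outer(seq_len(n), seq_len(n), "-"))

print(R_exch)
print(R_ar1)
```

Under "independence", the corresponding matrix would simply be the identity, diag(n).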

Mv

If either "stat_M_dep" or "non_stat_M_dep" is specified in corstr, then this assigns a numeric value for Mv. Otherwise, the default value is NULL.

beta_int

User specified initial values for regression parameters. The default value is NULL.

R

If corstr = "fixed" is specified, then R is a square matrix whose dimension equals the maximum cluster size, containing the user-specified correlations. Otherwise, the default value is NULL.
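
As a rough sketch of how such a matrix might be prepared, the base-R code below derives the maximum cluster size from a hypothetical id vector and runs basic sanity checks on a hypothetical user-specified matrix R_fixed. The data and matrix values are invented for illustration, and the checks are a suggestion rather than validation that PGEE is known to perform.

```r
# Hypothetical cluster identifiers: three clusters of sizes 3, 2 and 3.
id <- c(1, 1, 1, 2, 2, 3, 3, 3)
max_size <- max(table(id))  # largest cluster size, here 3

# A hypothetical user-specified correlation matrix of that dimension.
R_fixed <- matrix(c(1.0, 0.5, 0.2,
                    0.5, 1.0, 0.5,
                    0.2, 0.5, 1.0), nrow = max_size, byrow = TRUE)

# Sanity checks: correct dimension, symmetric, unit diagonal,
# and positive definite (all eigenvalues strictly positive).
is_valid <- nrow(R_fixed) == max_size &&
  isSymmetric(R_fixed) &&
  all(diag(R_fixed) == 1) &&
  all(eigen(R_fixed, symmetric = TRUE, only.values = TRUE)$values > 0)
print(is_valid)
```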

scale.fix

A logical variable; if true, the scale parameter is fixed at the value of scale.value. The default value is TRUE.

scale.value

If scale.fix = TRUE, this assigns a numeric value to which the scale parameter should be fixed. The default value is 1.

lambda

A numerical value for the penalization parameter of the SCAD function. It must be supplied by the user and is typically chosen via cross-validation.
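
For orientation, the standard SCAD penalty derivative from the literature can be sketched in base R as below. The function name scad_derivative is hypothetical, and a = 3.7 is the value commonly suggested for SCAD; whether PGEE uses exactly this form and constant internally is an assumption.

```r
# Standard SCAD penalty derivative (textbook form):
#   p'(theta) = lambda                          if |theta| <= lambda
#             = max(a*lambda - |theta|, 0)/(a-1) if |theta| >  lambda
# `scad_derivative` is a hypothetical helper, not part of geeVerse.
scad_derivative <- function(theta, lambda, a = 3.7) {
  theta <- abs(theta)
  lambda * (theta <= lambda) +
    pmax(a * lambda - theta, 0) / (a - 1) * (theta > lambda)
}

# Small coefficients get the full penalty rate, moderate ones a reduced
# rate, and large ones (beyond a * lambda) no penalty at all.
print(scad_derivative(c(0.5, 2, 4), lambda = 1))
```

A larger lambda therefore widens the range of coefficients that are penalized at the full rate, which in turn shrinks more of them toward zero.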

pindex

An index vector indicating the parameters that are not subject to penalization. The default value is NULL. However, for a model with an intercept, the intercept parameter should never be penalized.
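
One possible way to locate the intercept position is sketched below with base R's model.matrix. The toy data frame and variable names are hypothetical, and the sketch assumes that pindex refers to positions in the coefficient vector in design-matrix column order, which the text above does not state explicitly.

```r
# Hypothetical toy data, used only to build a design matrix.
set.seed(1)
toy <- data.frame(y = rnorm(6), x1 = rnorm(6), x2 = rnorm(6))

# Design matrix for a model with an intercept.
X <- model.matrix(y ~ x1 + x2, data = toy)

# Assumption: pindex indexes coefficients in design-matrix column order,
# so the intercept column position is what should be left unpenalized.
pindex <- which(colnames(X) == "(Intercept)")
print(pindex)
```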

eps

A numerical value for the epsilon used in the minorization-maximization algorithm. The default value is 10^-6.

maxiter

The maximum number of iterations used in the estimation algorithm. The default value is 30.

tol

The tolerance level that is used in the estimation algorithm. The default value is 10^-3.

silent

A logical variable; if false, the regression parameter estimates at each iteration are printed. The default value is TRUE.

Value

A PGEE object, which includes: fitted coefficients - the fitted single index coefficients with unit norm and first component being non-negative.

Examples

# Generate data
set.seed(2021)
sim_data <- generateData(nsub = 100, nobs = rep(10, 100), p = 100,
                         c(rep(1, 7), rep(0, 93)), rho = 0.6,
                         correlation = "AR1", dis = "normal", ka = 1)

X <- sim_data$X
y <- sim_data$y
id <- rep(1:100, each = 10)
data <- data.frame(X, y, id)

# Fit the penalized GEE with an exchangeable working correlation
PGEE_fit <- PGEE("y ~.-id-1", id = id, data = data,
                 corstr = "exchangeable", lambda = 0.01)
PGEE_fit$coefficients

[Package geeVerse version 0.2.1 Index]