springer {springer} | R Documentation |
Fit the model with given tuning parameters
Description
This function performs penalized variable selection for longitudinal data based on generalized estimating equations (GEE) or quadratic inference functions (QIF) with given values of the tuning parameters (lam1 and lam2). Typical usage is to first obtain the optimal tuning parameters using cross-validation, then provide them to the springer function.
Usage
springer(
clin = NULL,
e,
g,
y,
beta0,
func,
corr,
structure,
lam1,
lam2,
maxits = 30,
tol = 0.001
)
Arguments
clin |
a matrix of clinical covariates. The default value is NULL. Including clinical covariates is optional and left to the user. |
e |
a matrix of environment factors. |
g |
a matrix of genetic factors. |
y |
the longitudinal response. |
beta0 |
the initial coefficient vector. |
func |
the framework to obtain the score equation. Two choices are available: "GEE" and "QIF". |
corr |
the working correlation structure adopted in the estimation algorithm. Three choices are available: "exchangeable", "AR-1", and "independence". |
structure |
Three choices are available for structured variable selection. "bilevel" for sparse-group selection on both group-level and individual-level. "group" for selection on group-level only. "individual" for selection on individual-level only. |
lam1 |
the tuning parameter |
lam2 |
the tuning parameter |
maxits |
the maximum number of iterations that is used in the estimation algorithm. The default value is 30. |
tol |
the tolerance level. Coefficients whose absolute values are smaller than the tolerance level will be set to zero. An ad hoc choice is 0.001, which is also the default. |
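The thresholding rule described for tol can be illustrated with a minimal sketch. The vector b below is made up for illustration and is not output from springer; the strict inequality follows the wording "smaller than" above and is an assumption about the boundary case.

```r
# Hypothetical coefficient estimates (illustrative values only, not springer output)
b <- c(0.8, -0.0004, 0.0009, -1.2, 0.002)
tol <- 0.001

# Coefficients whose absolute value is smaller than tol are set to zero
b_thresh <- ifelse(abs(b) < tol, 0, b)
b_thresh
```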
Details
Recall the data model described in "dat":
Y_{ij}= \alpha_0 + \sum_{m=1}^{t}\theta_m Clin_{ijm} + \sum_{u=1}^{q}\alpha_u E_{iju} + \sum_{v=1}^{p}\eta_v^\top Z_{ijv}+\epsilon_{ij},
where Z_{ijv} contains the v-th genetic main factor and its interactions with the q environment factors for the j-th measurement on the i-th subject, and \eta_{v} is the corresponding coefficient vector of length 1+q.
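As a rough illustration of how Z_{ijv} is composed, the sketch below builds, for each genetic factor, a block holding its main effect followed by its q interactions with the environment factors. The data are simulated, each row is assumed to stand for one (i, j) observation, and make_Z is a hypothetical helper, not a function from the springer package.

```r
set.seed(1)
n <- 5   # number of rows (assumed: one row per subject-by-measurement pair)
q <- 2   # number of environment factors
p <- 3   # number of genetic factors

e <- matrix(rnorm(n * q), n, q)  # toy environment factors
g <- matrix(rnorm(n * p), n, p)  # toy genetic factors

# Hypothetical helper: Z_v holds the v-th genetic main effect followed by
# its q interactions with the environment factors (1 + q columns in total)
make_Z <- function(v) cbind(g[, v], g[, v] * e)

Z <- lapply(seq_len(p), make_Z)
dim(Z[[1]])  # n rows and 1 + q columns
```

Each block has 1 + q columns, which matches the stated length of \eta_{v}.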
When structure="bilevel", variable selection for genetic main effects and gene-environment interactions under the longitudinal response will be conducted on both individual and group levels (bi-level selection):
- Group-level selection: by determining whether ||\eta_{v}||_{2}=0, we can tell whether the v-th genetic variant has any effect at all.
- Individual-level selection: investigate whether the v-th genetic variant has a main effect, G\times E interactions, or both, by determining which components of \eta_{v} are non-zero.
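The two levels of selection can be read off a fitted coefficient vector as in the following sketch. The values are invented for illustration, and the layout (intercept, then clinical, then environment coefficients, then one block of length 1+q per genetic variant) is an assumption, chosen to agree with the Examples section, where the first 1+t+u entries are treated as non-genetic.

```r
n_clin <- 1  # number of clinical covariates (t in the model)
q <- 2       # number of environment factors
p <- 3       # number of genetic variants

# Hypothetical estimate (illustrative values only):
# intercept, n_clin clinical, q environment, then p blocks of length 1 + q
beta <- c(0.5, 0.3, 0.2, -0.1,
          0,   0,   0,     # variant 1: no effect at all
          1.1, 0,   0,     # variant 2: main effect only
          0.7, 0,  -0.4)   # variant 3: main effect and one interaction

# Drop the non-genetic part; column v of eta then holds eta_v
eta <- matrix(beta[-seq_len(1 + n_clin + q)], nrow = 1 + q)

# Group level: is ||eta_v||_2 nonzero?
group_selected <- sqrt(colSums(eta^2)) != 0

# Individual level: which components of each eta_v are nonzero?
individual_selected <- eta != 0
```

Here group_selected flags variants 2 and 3, and the columns of individual_selected show which of their components (main effect in row 1, interactions in the remaining rows) are retained.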
If structure="group", only group-level selection will be conducted on ||\eta_{v}||_{2}; if structure="individual", only individual-level selection will be conducted on each \eta_{vu} (u=1,\ldots,q).
This function also provides choices for the framework that is used. If func="QIF", variable selection will be conducted within the quadratic inference functions framework; if func="GEE", variable selection will be conducted within the generalized estimating equation framework.
There are three options for the choice of the working correlation. If corr="exchangeable", the exchangeable working correlation will be applied; if corr="AR-1", the AR-1 working correlation will be adopted; if corr="independence", the independence working correlation will be used. Please check the references for more details.
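For readers unfamiliar with these structures, the sketch below builds the three working correlation matrices in their standard textbook form for k repeated measurements. The helper working_corr and the parameter rho are hypothetical and not part of the springer package; how the package estimates and uses the correlation is described in the references.

```r
# Hypothetical helper (not part of springer): k x k working correlation matrix
working_corr <- function(k, rho = 0,
                         type = c("exchangeable", "AR-1", "independence")) {
  type <- match.arg(type)
  switch(type,
    # all off-diagonal entries equal to rho
    "exchangeable" = {
      R <- matrix(rho, k, k)
      diag(R) <- 1
      R
    },
    # entry (j, l) equals rho^|j - l|
    "AR-1" = rho^abs(outer(seq_len(k), seq_len(k), "-")),
    # identity matrix: measurements treated as uncorrelated
    "independence" = diag(k)
  )
}

working_corr(3, rho = 0.5, type = "AR-1")
```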
Value
coef |
the coefficient vector. |
Examples
data("dat")
## load the clinical covariates, environment factors, genetic factors and
## response from the "dat" object
clin <- dat$clin
if (is.null(clin)) { t <- 0 } else { t <- ncol(clin) }
e <- dat$e
u <- ncol(e)
g <- dat$g
y <- dat$y
## initial coefficient vector
beta0 <- dat$coef
## indices of the true nonzero coefficients
index <- dat$index
beta <- springer(clin = clin, e, g, y, beta0, func = "GEE", corr = "independence",
                 structure = "bilevel", lam1 = dat$lam1, lam2 = dat$lam2,
                 maxits = 30, tol = 0.01)
## focus only on the genetic main effects and gene-environment interactions:
## zero out the intercept, clinical and environment coefficients
beta[1:(1 + t + u)] <- 0
## effects that have nonzero coefficients
pos <- which(beta != 0)
## number of true positives and false positives
tp <- length(intersect(index, pos))
fp <- length(pos) - tp
list(tp = tp, fp = fp)