roben {roben} | R Documentation |
fit a robust Bayesian variable selection
Description
fit a robust Bayesian variable selection model for G×E interactions.
Usage
roben(
X,
Y,
E,
clin = NULL,
iterations = 10000,
burn.in = NULL,
robust = TRUE,
sparse = TRUE,
structure = c("sparsegroup", "group", "individual"),
hyper = NULL,
debugging = FALSE
)
Arguments
X |
the matrix of predictors (genetic factors) without intercept. Each row should be an observation vector. X will be centered and a column of 1 will be added to the X matrix as the intercept. |
Y |
the response variable. The current version of roben only supports continuous response. |
E |
a matrix of environmental factors. E will be centered. The interaction terms between X (G factors) and E will be automatically created and included in the model. |
clin |
a matrix of clinical variables. Clinical variables are not subject to penalty. |
iterations |
the number of MCMC iterations. |
burn.in |
the number of iterations for burn-in. |
robust |
logical flag. If TRUE, robust methods will be used. |
sparse |
logical flag. If TRUE, spike-and-slab priors will be used to shrink coefficients of irrelevant covariates to zero exactly. |
structure |
three choices are available. "sparsegroup" for sparse-group selection, which is a bi-level selection on both group-level and individual-level. "group" for selection on group-level only. "individual" for selection on individual-level only. |
hyper |
a named list of hyperparameters. |
debugging |
logical flag. If TRUE, progress will be output to the console and extra information will be returned. |
Details
Consider the data model described in "data
":
where the main and interaction effects of the th genetic variant is corresponding to the coefficient vector
.
When structure="sparsegroup" (default setting), selection will be conducted on both individual and group levels (bi-level selection):
-
Group-level selection: by determining whether
, we can know if the
th genetic variant has any effect at all.
-
Individual-level selection: investigate whether the
th genetic variant has main effect, G
E interaction or both, by determining which components in
has non-zero values.
If structure="group", only group-level selection will be conducted on . If structure="individual", only individual-level selection will be conducted on each
, (
).
When sparse=TRUE (default), spike–and–slab priors are imposed on individual and/or group levels to identify important main and interaction effects. Otherwise, Laplacian shrinkage will be used.
When robust=TRUE (default), the distribution of is defined as a Laplace distribution with density
, (
), which leads to a Bayesian formulation of LAD regression. If robust=FALSE,
follows a normal distribution.
Both and
will be centered before the generation of interaction terms, in order to prevent the multicollinearity between main effects and interaction terms.
Users can modify the hyper-parameters by providing a named list of hyper-parameters via the argument ‘hyper’. The list can have the following named components
a0, b0: shape parameters of the Beta priors (
) on
.
a1, b1: shape parameters of the Beta priors (
) on
.
c1, c2: the shape parameter and the rate parameter of the Gamma prior on
.
d1, d2: the shape parameter and the rate parameter of the Gamma priors on
.
Please check the references for more details about the prior distributions.
See Also
Examples
data(GxE_small)
## default method
iter=5000
fit=roben(X, Y, E, clin, iterations = iter)
fit$coefficient
## Ture values of parameters of main G effects and interactions
coeff$GE
## Compute TP and FP
sel=GxESelection(fit)
pos=which(sel$indicator != 0)
tp=length(intersect(which(coeff$GE != 0), pos))
fp=length(pos)-tp
list(tp=tp, fp=fp)
## alternative: robust group selection
fit=roben(X, Y, E, clin, iterations=iter, structure="g")
fit$coefficient
## alternative: non-robust sparse group selection
fit=roben(X, Y, E, clin, iterations=iter, robust=FALSE)
fit$coefficient