roben {roben}R Documentation

fit a robust Bayesian variable selection

Description

fit a robust Bayesian variable selection model for G×E interactions.

Usage

roben(
  X,
  Y,
  E,
  clin = NULL,
  iterations = 10000,
  burn.in = NULL,
  robust = TRUE,
  sparse = TRUE,
  structure = c("sparsegroup", "group", "individual"),
  hyper = NULL,
  debugging = FALSE
)

Arguments

X

the matrix of predictors (genetic factors) without intercept. Each row should be an observation vector. X will be centered and a column of 1 will be added to the X matrix as the intercept.

Y

the response variable. The current version of roben only supports continuous response.

E

a matrix of environmental factors. E will be centered. The interaction terms between X (G factors) and E will be automatically created and included in the model.

clin

a matrix of clinical variables. Clinical variables are not subject to penalty.

iterations

the number of MCMC iterations.

burn.in

the number of iterations for burn-in.

robust

logical flag. If TRUE, robust methods will be used.

sparse

logical flag. If TRUE, spike-and-slab priors will be used to shrink coefficients of irrelevant covariates to zero exactly.

structure

three choices are available. "sparsegroup" for sparse-group selection, which is a bi-level selection on both group-level and individual-level. "group" for selection on group-level only. "individual" for selection on individual-level only.

hyper

a named list of hyperparameters.

debugging

logical flag. If TRUE, progress will be output to the console and extra information will be returned.

Details

Consider the data model described in "data":

Yi=α0+t=1qαtClinit+m=1kθmEim+j=1p(Uijβj)+ϵi,Y_{i} = \alpha_{0} + \sum_{t=1}^{q}\alpha_{t}Clin_{it} + \sum_{m=1}^{k}\theta_{m}E_{im} + \sum_{j=1}^{p} \big(U_{ij}^\top\beta_{j}\big) +\epsilon_{i},

where the main and interaction effects of the jjth genetic variant is corresponding to the coefficient vector βj=(βj1,βj2,,βjL)\beta_{j}=(\beta_{j1}, \beta_{j2},\ldots,\beta_{jL})^\top.

When structure="sparsegroup" (default setting), selection will be conducted on both individual and group levels (bi-level selection):

If structure="group", only group-level selection will be conducted on βj2||\beta_{j}||_{2}. If structure="individual", only individual-level selection will be conducted on each βjl\beta_{jl}, (l=1,,Ll=1,\ldots,L).

When sparse=TRUE (default), spike–and–slab priors are imposed on individual and/or group levels to identify important main and interaction effects. Otherwise, Laplacian shrinkage will be used.

When robust=TRUE (default), the distribution of ϵi\epsilon_{i} is defined as a Laplace distribution with density f(ϵiν)=ν2exp{νϵi} f(\epsilon_{i}|\nu) = \frac{\nu}{2}\exp\left\{-\nu |\epsilon_{i}|\right\} , (i=1,,ni=1,\dots,n), which leads to a Bayesian formulation of LAD regression. If robust=FALSE, ϵi\epsilon_{i} follows a normal distribution.

Both XX and EE will be centered before the generation of interaction terms, in order to prevent the multicollinearity between main effects and interaction terms.

Users can modify the hyper-parameters by providing a named list of hyper-parameters via the argument ‘hyper’. The list can have the following named components

Please check the references for more details about the prior distributions.

See Also

GxESelection

Examples

data(GxE_small)

## default method
iter=5000
fit=roben(X, Y, E, clin, iterations = iter)
fit$coefficient

## Ture values of parameters of main G effects and interactions
coeff$GE

## Compute TP and FP
sel=GxESelection(fit)
pos=which(sel$indicator != 0)
tp=length(intersect(which(coeff$GE != 0), pos))
fp=length(pos)-tp
list(tp=tp, fp=fp)


## alternative: robust group selection
fit=roben(X, Y, E, clin, iterations=iter, structure="g")
fit$coefficient

## alternative: non-robust sparse group selection
fit=roben(X, Y, E, clin, iterations=iter, robust=FALSE)
fit$coefficient



[Package roben version 0.1.1 Index]