dat {marble}    R Documentation

Simulated data for demonstrating the features of marble.

Description

Simulated gene expression data for demonstrating the features of marble.

Usage

data("dat")

Format

dat consists of four components: X, Y, E, clin.

Details

The data model for generating Y

Use the subscript i to denote the ith subject. Let (Y_{i}, X_{i}, E_{i}, clin_{i}) (i=1,\ldots,n) be independent and identically distributed random vectors. Y_{i} is a continuous response variable representing the phenotype. X_{i} is the p-dimensional vector of genetic factors. The environmental factors and clinical factors are denoted by the q-dimensional vector E_{i} and the m-dimensional vector clin_{i}, respectively. The random error \epsilon_{i} follows some heavy-tailed distribution. For X_{ij} (j = 1,\ldots,p), the measurement of the jth genetic factor on the ith subject, consider the following model:

Y_{i} = \alpha_{0} + \sum_{k=1}^{q}\alpha_{k}E_{ik}+\sum_{t=1}^{m}\gamma_{t}clin_{it}+\beta_{j}X_{ij}+\sum_{k=1}^{q}\eta_{jk}X_{ij}E_{ik}+\epsilon_{i},

where \alpha_{0} is the intercept, and the \alpha_{k}'s and \gamma_{t}'s are the regression coefficients corresponding to the effects of the environmental and clinical factors, respectively. The \beta_{j}'s and \eta_{jk}'s are the regression coefficients of the genetic variants and the G\times E interaction effects, respectively. The G\times E interaction effects are defined with W_{j} = (X_{j}E_{1},\ldots,X_{j}E_{q}). With a slight abuse of notation, denote \tilde{W} = W_{j}. Denote \alpha=(\alpha_{1}, \ldots, \alpha_{q})^{T}, \gamma=(\gamma_{1}, \ldots, \gamma_{m})^{T}, \beta=(\beta_{1}, \ldots, \beta_{p})^{T}, \eta=(\eta_{1}^{T}, \ldots, \eta_{p}^{T})^{T} with \eta_{j}=(\eta_{j1}, \ldots, \eta_{jq})^{T}, and \tilde{W} = (\tilde{W}_{1}, \ldots, \tilde{W}_{p}). Then the model can be written as

Y_{i} = \alpha_{0} + E_{i}\alpha + clin_{i}\gamma + X_{ij}\beta_{j} + \tilde{W}_{i}\eta_{j} + \epsilon_{i}.
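
The model for a single genetic factor j can be illustrated with a small base R simulation. This is only a sketch under stated assumptions, not the procedure used to generate dat: the sample size, the dimensions q and m, all coefficient values, and the choice of a t distribution with 2 degrees of freedom for the heavy-tailed error are hypothetical.

```r
# Hypothetical sizes and coefficients, chosen only for illustration.
set.seed(1)
n <- 200; q <- 2; m <- 3

E    <- matrix(rnorm(n * q), n, q)   # environmental factors E_i (n x q)
clin <- matrix(rnorm(n * m), n, m)   # clinical factors clin_i (n x m)
Xj   <- rnorm(n)                     # one genetic factor X_ij (length n)

alpha0 <- 1                          # intercept
alpha  <- c(0.5, -0.5)               # environmental effects (length q)
gamma  <- c(0.3, 0.2, -0.4)          # clinical effects (length m)
beta_j <- 0.8                        # main genetic effect
eta_j  <- c(0.6, 0)                  # G x E interaction effects (length q)

# G x E interaction terms: column k holds X_ij * E_ik.
Wj <- Xj * E

# Heavy-tailed errors; the t distribution with 2 df is an assumption.
eps <- rt(n, df = 2)

# Matrix form of the model for genetic factor j.
Y <- as.vector(alpha0 + E %*% alpha + clin %*% gamma +
               Xj * beta_j + Wj %*% eta_j + eps)
```

Multiplying the vector Xj by the matrix E recycles Xj down each column, so Wj has one interaction column per environmental factor, matching W_{j} = (X_{j}E_{1},\ldots,X_{j}E_{q}).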

See Also

marble

Examples

data(dat)
dim(X)
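
As a possible extension of the example above, the components named in the Format section (X, Y, E, clin) can be loaded into a separate environment and inspected there, which avoids overwriting objects of the same name in the workspace. The sketch assumes the marble package is installed and does nothing otherwise; the environment name env is arbitrary.

```r
# Load dat into a private environment instead of the workspace.
if (requireNamespace("marble", quietly = TRUE)) {
  env <- new.env()
  data("dat", package = "marble", envir = env)

  # List the loaded objects, then show the structure of each one.
  print(ls(env))
  for (nm in ls(env)) {
    cat(nm, ":\n")
    str(get(nm, envir = env))
  }
}
```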

[Package marble version 0.0.3 Index]