dat {marble} | R Documentation |
Simulated data for demonstrating the features of marble.
Description
Simulated gene expression data for demonstrating the features of marble.
Usage
data("dat")
Format
dat consists of four components: X, Y, E, clin.
Details
The data model for generating Y
Use subscript i to denote the i-th subject. Let (Y_{i}, X_{i}, E_{i}, clin_{i}) (i = 1,\ldots,n) be independent and identically distributed random vectors. Y_{i} is a continuous response variable representing the phenotype. X_{i} is the p-dimensional vector of genetic factors. The environmental factors and clinical factors are denoted as the q-dimensional vector E_{i} and the m-dimensional vector clin_{i}, respectively. The random error \epsilon_{i} follows some heavy-tailed distribution. For X_{ij} (j = 1,\ldots,p), the measurement of the j-th genetic factor on the i-th subject, consider the following model:
Y_{i} = \alpha_{0} + \sum_{k=1}^{q}\alpha_{k}E_{ik}+\sum_{t=1}^{m}\gamma_{t}clin_{it}+\beta_{j}X_{ij}+\sum_{k=1}^{q}\eta_{jk}X_{ij}E_{ik}+\epsilon_{i},
where \alpha_{0} is the intercept, and the \alpha_{k}'s and \gamma_{t}'s are the regression coefficients corresponding to the effects of environmental and clinical factors, respectively. The \beta_{j}'s and \eta_{jk}'s are the regression coefficients of the genetic variants and the G\times E interaction effects, respectively.
The G\times E interaction effects are defined with W_{j} = (X_{j}E_{1},\ldots,X_{j}E_{q}).
With a slight abuse of notation, denote \tilde{W} = W_{j}.
Denote \alpha=(\alpha_{1}, \ldots, \alpha_{q})^{T}, \gamma=(\gamma_{1}, \ldots, \gamma_{m})^{T}, \beta=(\beta_{1}, \ldots, \beta_{p})^{T}, \eta=(\eta_{1}^{T}, \ldots, \eta_{p}^{T})^{T}, \tilde{W} = (\tilde{W}_{1}, \dots, \tilde{W}_{p}).
Then the model can be written as
Y_{i} = E_{i}\alpha + clin_{i}\gamma + X_{ij}\beta_{j} + \tilde{W}_{i}\eta_{j} + \epsilon_{i}.
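As a rough illustration of the model above, the following base R sketch simulates a response for a single genetic factor j and checks that the summation form and the matrix form give the same values. It is not the code used to generate dat: the dimensions, coefficient values, normal covariates, and the t distribution with 2 degrees of freedom for the heavy-tailed error are all assumptions chosen for the example. The intercept \alpha_{0} from the summation form is kept explicitly in both versions.

```r
set.seed(1)

# Hypothetical dimensions (not those of the shipped dat object)
n <- 100; p <- 5; q <- 2; m <- 3

# Assumed covariates: genetic (X), environmental (E), clinical (clin)
X    <- matrix(rnorm(n * p), n, p)
E    <- matrix(rnorm(n * q), n, q)
clin <- matrix(rnorm(n * m), n, m)

# Assumed coefficient values for one genetic factor j
j      <- 2
alpha0 <- 1
alpha  <- c(0.5, -0.5)        # environmental effects, length q
gamma  <- c(0.3, 0.2, -0.1)   # clinical effects, length m
beta_j <- 0.8                 # main effect of X_j
eta_j  <- c(0.6, -0.4)        # G x E interaction effects, length q

# Heavy-tailed error: t with 2 df is one possible choice (assumption)
eps <- rt(n, df = 2)

# W_j = (X_j E_1, ..., X_j E_q): each column of E multiplied by X_j
W_j <- X[, j] * E

# Matrix form of the model
Y_mat <- alpha0 + drop(E %*% alpha) + drop(clin %*% gamma) +
  X[, j] * beta_j + drop(W_j %*% eta_j) + eps

# Summation form of the model, term by term
Y_sum <- alpha0 + beta_j * X[, j] + eps
for (k in seq_len(q)) {
  Y_sum <- Y_sum + alpha[k] * E[, k] + eta_j[k] * X[, j] * E[, k]
}
for (l in seq_len(m)) {
  Y_sum <- Y_sum + gamma[l] * clin[, l]
}

# Both forms should agree up to floating point error
same <- isTRUE(all.equal(Y_mat, Y_sum))
```

Here W_j relies on R's column-wise recycling: the length-n vector X[, j] is multiplied element by element into each column of the n x q matrix E.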
See Also
Examples
data(dat)
dim(X)