dat {springer} | R Documentation |
Simulated data for demonstrating the usage of springer
Description
Simulated gene expression data for demonstrating the usage of springer.
Usage
data("dat")
Format
The dat file consists of five components: e (environmental factors), g (genetic factors), y (response), clin (clinical factors) and coeff. The coeff component holds the true parameter values used for generating Y.
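A minimal sketch of how the components might be inspected after loading. It assumes the springer package is installed, and it loads the data into a separate environment (here called `env`, a name chosen only for this illustration) so that whatever objects the file provides can be listed without overwriting anything in the workspace.

```r
# Sketch only: runs the inspection step if the springer package is available.
env <- new.env()

if (requireNamespace("springer", quietly = TRUE)) {
  # Load the objects stored in "dat" into env rather than the global workspace
  data("dat", package = "springer", envir = env)

  # Names of the loaded objects
  print(ls(env))

  # Compact overview (type and dimensions) of each loaded object
  str(mget(ls(env), envir = env), max.level = 1)
}
```

Loading into a dedicated environment is a design choice for exploration; a plain `data("dat")` as shown under Usage places the objects directly in the workspace.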
Details
The data model for generating Y
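Consider a longitudinal case study with $n$ subjects and $k_i$ measurements over time for the $i$th subject ($i = 1, \ldots, n$).
Let $Y_{ij}$ be the response of the $j$th observation for the $i$th subject ($i = 1, \ldots, n$, $j = 1, \ldots, k_i$), $X_{ij}$ be a $p$-dimensional vector of covariates denoting $p$ genetic factors, $E_{ij}$ be a $q$-dimensional environmental factor and $\mathrm{Clin}_{ij}$ be a $t$-dimensional clinical factor. There is time dependence among measurements on the same subject, but we assume that the measurements between different subjects are independent. The model we used for hierarchical variable selection for gene–environment interactions is given as:

$$Y_{ij} = \alpha_0 + \sum_{m=1}^{t} \theta_m \mathrm{Clin}_{ijm} + \sum_{u=1}^{q} \alpha_u E_{iju} + \sum_{v=1}^{p} \Big( \gamma_v X_{ijv} + \sum_{u=1}^{q} h_{uv} E_{iju} X_{ijv} \Big) + \epsilon_{ij},$$

where $\alpha_0$ is the intercept and the marginal density of $Y_{ij}$ belongs to a canonical exponential family defined in Liang and Zeger (1986).
Define $\eta_v = (\gamma_v, h_{1v}, \ldots, h_{qv})^\top$, which is a vector of length $q+1$, and $Z_{ijv} = (X_{ijv}, E_{ij1} X_{ijv}, \ldots, E_{ijq} X_{ijv})^\top$, which contains the main genetic effect of the $v$th SNP from the $j$th measurement on the $i$th subject and its interactions with all the $q$ environmental factors. The model can be written as:

$$Y_{ij} = \alpha_0 + \sum_{m=1}^{t} \theta_m \mathrm{Clin}_{ijm} + \sum_{u=1}^{q} \alpha_u E_{iju} + \sum_{v=1}^{p} \eta_v^\top Z_{ijv} + \epsilon_{ij},$$

where $Z_{ijv}$ is the $v$th genetic factor and its interactions with the $q$ environment factors for the $j$th measurement on the $i$th subject, and $\eta_v$ is the corresponding coefficient vector of length $q+1$. The random error $\epsilon_i = (\epsilon_{i1}, \ldots, \epsilon_{ik_i})^\top$ is assumed to follow a multivariate normal distribution with $\Sigma_i$ as the covariance matrix for the repeated measurements of the $i$th subject among the $k_i$ time points.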
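The data model above can be illustrated with a short base-R simulation. This is only a sketch under stated assumptions, not the procedure used to produce dat: the sample sizes, the coefficient values, the balanced design (the same number of measurements for every subject) and the AR(1) form of the within-subject covariance matrix are all hypothetical choices made for this example.

```r
set.seed(1)

# Hypothetical sizes for illustration only (not the sizes used in dat)
n      <- 50   # subjects
k      <- 4    # measurements per subject (balanced design assumed)
p      <- 10   # genetic factors
q      <- 2    # environmental factors
t_clin <- 2    # clinical factors
N      <- n * k  # total observations, ordered subject by subject

# Covariates: one row per (subject, measurement) pair
g    <- matrix(rnorm(N * p), N, p)
e    <- matrix(rnorm(N * q), N, q)
clin <- matrix(rnorm(N * t_clin), N, t_clin)

# For each genetic factor v, stack its main effect and its q interactions
# with the environmental factors: N x p(q+1) design matrix
z <- do.call(cbind, lapply(seq_len(p), function(v) cbind(g[, v], e * g[, v])))

# Hypothetical true coefficients; column v of eta is the length-(q+1)
# coefficient vector for genetic factor v (main effect, then interactions)
alpha0 <- 1
theta  <- c(0.5, -0.5)
alpha  <- c(0.8, 0.6)
eta    <- matrix(0, nrow = q + 1, ncol = p)
eta[, 1] <- c(1.0, 0.5, 0)  # main effect plus one interaction
eta[, 2] <- c(0.7, 0,   0)  # main effect only

# Within-subject errors: multivariate normal with an assumed AR(1) covariance
rho   <- 0.5
sigma <- rho^abs(outer(seq_len(k), seq_len(k), "-"))
L     <- t(chol(sigma))  # lower-triangular factor, L %*% t(L) equals sigma
eps   <- as.vector(L %*% matrix(rnorm(k * n), k, n))  # independent across subjects

# Response generated from the model
y <- as.vector(alpha0 + clin %*% theta + e %*% alpha + z %*% as.vector(eta) + eps)
```

Because `eta` is filled column by column, `as.vector(eta)` lines up with the column order of `z`, so each genetic factor's main effect and interactions are multiplied by their own coefficients.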