| generate_phenodata {CJAMP} | R Documentation | 
Functions to generate phenotype data.
Description
Functions to generate standard normal or binary phenotypes based on provided genetic
data, for specified effect sizes.
The functions generate_phenodata_1_simple and
generate_phenodata_1 generate one phenotype Y conditional on
single nucleotide variants (SNVs) and two covariates.
The functions generate_phenodata_2_bvn and generate_phenodata_2_copula
generate two phenotypes Y_1, Y_2, whose dependence is measured by Kendall's tau,
conditional on the provided SNVs and two covariates.
Usage
generate_phenodata_1_simple(genodata = NULL, type = "quantitative",
  b = 0, a = c(0, 0.5, 0.5))
generate_phenodata_1(genodata = NULL, type = "quantitative", b = 0.6,
  a = c(0, 0.5, 0.5), MAF_cutoff = 1, prop_causal = 0.1,
  direction = "a")
generate_phenodata_2_bvn(genodata = NULL, tau = NULL, b1 = 0,
  b2 = 0, a1 = c(0, 0.5, 0.5), a2 = c(0, 0.5, 0.5))
generate_phenodata_2_copula(genodata = NULL, phi = NULL, tau = 0.5,
  b1 = 0.6, b2 = 0.6, a1 = c(0, 0.5, 0.5), a2 = c(0, 0.5, 0.5),
  MAF_cutoff = 1, prop_causal = 0.1, direction = "a")
Arguments
| genodata | Numeric input vector or dataframe containing the genetic variant(s) in columns. Must be in allelic coding 0, 1, 2. | 
| type | String with value "quantitative" or "binary" specifying whether a quantitative or a binary phenotype is generated. | 
| b | Numeric value or vector specifying the genetic effect size(s) of the provided SNVs ( | 
| a | Numeric vector specifying the effect sizes of the covariates  | 
| MAF_cutoff | Numeric value specifying a minor allele frequency cutoff that determines the SNVs among which the causal SNVs are sampled for the phenotype generation. | 
| prop_causal | Numeric value specifying the desired proportion of causal SNVs among all SNVs (for example, 0.1 for 10%). | 
| direction | String with value  | 
| tau | Numeric value specifying Kendall's tau, which determines the dependence between the two generated phenotypes. | 
| b1 | Numeric value or vector specifying the genetic effect size(s) of the provided SNVs ( | 
| b2 | Numeric value or vector specifying the genetic effect size(s) of the provided SNVs ( | 
| a1 | Numeric vector specifying the effect sizes of the covariates  | 
| a2 | Numeric vector specifying the effect sizes of the covariates  | 
| phi | Numeric value specifying the parameter  | 
Details
In more detail, the function generate_phenodata_1_simple
generates a quantitative or binary phenotype Y with n observations,
conditional on the specified SNVs with given effect sizes and conditional
on one binary and one standard normally-distributed covariate with
specified effect sizes. n is given through the provided SNVs.
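The model described above can be sketched in base R for the quantitative case. This is an illustrative sketch only, not the CJAMP implementation: the simulated genotype G, the probability 0.5 for the binary covariate, the standard normal error term, and the reading of a as (intercept, effect of X_1, effect of X_2) are all assumptions.

```r
# Illustrative base-R sketch; not the CJAMP source code.
set.seed(1)
n <- 500

# Hypothetical single SNV in allelic coding 0, 1, 2 (assumed MAF of 0.2)
G <- rbinom(n, size = 2, prob = 0.2)

# Assumed covariates: X1 binary with probability 0.5, X2 standard normal
X1 <- rbinom(n, size = 1, prob = 0.5)
X2 <- rnorm(n)

a <- c(0, 0.5, 0.5)  # assumed order: intercept, effect of X1, effect of X2
b <- 2               # genetic effect size of the SNV

# Quantitative phenotype: linear predictor plus standard normal noise
eta <- a[1] + a[2] * X1 + a[3] * X2 + b * G
Y <- eta + rnorm(n)

phenodata <- data.frame(Y = Y, X1 = X1, X2 = X2)
head(phenodata)
```

The resulting data frame has one row per individual, so n follows directly from the length of the genotype vector.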
generate_phenodata_1 extends generate_phenodata_1_simple and
additionally allows specifying the proportion of causal SNVs, a minor
allele frequency cutoff on the causal SNVs, and varying effect
directions. Again, n is given through the provided SNVs.
The function generate_phenodata_2_bvn generates
two quantitative phenotypes Y_1, Y_2 from the bivariate normal
distribution, conditional on one binary and one standard
normally-distributed covariate X_1, X_2, so that their dependence is
given by Kendall's tau (argument tau).
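For a bivariate normal distribution, Kendall's tau and the Pearson correlation rho are linked by the standard relation rho = sin(pi * tau / 2). The base-R sketch below uses this relation to draw a correlated pair and checks the empirical Kendall's tau. Whether generate_phenodata_2_bvn performs the conversion in exactly this way is an assumption; covariates and SNVs are omitted for brevity.

```r
# Illustrative base-R sketch; not the CJAMP source code.
set.seed(2)
n <- 2000
tau <- 0.5

# Standard conversion from Kendall's tau to the Pearson correlation
# of a bivariate normal distribution (about 0.707 for tau = 0.5)
rho <- sin(pi * tau / 2)

# Draw a correlated standard normal pair without additional packages
Z1 <- rnorm(n)
Z2 <- rho * Z1 + sqrt(1 - rho^2) * rnorm(n)

# The empirical Kendall's tau should be close to the target value
tau_hat <- cor(Z1, Z2, method = "kendall")
tau_hat
```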
The function generate_phenodata_2_copula generates
two quantitative phenotypes Y_1, Y_2 from the Clayton copula, using the
function generate_clayton_copula, conditional on one binary and one
standard normally-distributed covariate X_1, X_2. Y_1, Y_2 are
marginally normally distributed, and their dependence, measured by
Kendall's tau, is set through tau or phi.
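For the Clayton copula, Kendall's tau and the copula parameter are linked by tau = phi / (phi + 2), so that phi = 2 * tau / (1 - tau). The base-R sketch below samples from a Clayton copula by conditional inversion and transforms the result to standard normal margins. It does not call generate_clayton_copula, and it is an assumption that the package proceeds in a comparable way; covariates and SNVs are omitted.

```r
# Illustrative base-R sketch; not the CJAMP source code.
set.seed(3)
n <- 2000
tau <- 0.5

# Clayton copula parameter implied by Kendall's tau (equals 2 for tau = 0.5)
phi <- 2 * tau / (1 - tau)

# Conditional inversion: draw U, then V given U from the Clayton copula
U <- runif(n)
W <- runif(n)
V <- (U^(-phi) * (W^(-phi / (1 + phi)) - 1) + 1)^(-1 / phi)

# Transform the uniform margins to standard normal margins
Y1 <- qnorm(U)
Y2 <- qnorm(V)

# Kendall's tau is invariant under monotone transformations of the margins
tau_hat <- cor(Y1, Y2, method = "kendall")
tau_hat
```

Unlike the bivariate normal case, the Clayton copula concentrates the dependence in the lower tail, which is visible in a scatter plot of Y1 against Y2.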
The genetic effect sizes are the specified numeric values b and
b1, b2, respectively, in the functions generate_phenodata_1_simple
and generate_phenodata_2_bvn. In
generate_phenodata_1 and generate_phenodata_2_copula,
the genetic effect sizes are computed by multiplying b or b1, b2,
respectively, with the absolute value of the log10-transformed
minor allele frequencies, so that rarer variants have larger effect sizes.
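The weighting by minor allele frequency can be illustrated with a few lines of base R. The MAF values below are hypothetical, and the sketch covers only the multiplication described above, not the selection of causal SNVs or the effect directions.

```r
# Illustrative base-R sketch; not the CJAMP source code.
MAF <- c(0.01, 0.05, 0.2, 0.4)  # hypothetical minor allele frequencies
b <- 0.6                        # base genetic effect size

# Effect size per SNV: b times the absolute log10-transformed MAF
beta <- b * abs(log10(MAF))
round(beta, 3)
# approximately 1.200 0.781 0.419 0.239
```

A variant with MAF 0.01 thus receives five times the effect size of a variant with MAF of about 0.4.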
Value
A dataframe containing n observations of the phenotype Y or phenotypes
Y_1, Y_2 and of the covariates X_1, X_2.
Examples
# Generate genetic data:
set.seed(10)
genodata <- generate_genodata(n_SNV = 20, n_ind = 1000)
compute_MAF(genodata)
# Generate different phenotype data:
phenodata1 <- generate_phenodata_1_simple(genodata = genodata[,1],
                                          type = "quantitative", b = 0)
phenodata2 <- generate_phenodata_1_simple(genodata = genodata[,1],
                                          type = "quantitative", b = 2)
phenodata3 <- generate_phenodata_1_simple(genodata = genodata,
                                          type = "quantitative", b = 2)
phenodata4 <- generate_phenodata_1_simple(genodata = genodata,
                                          type = "quantitative",
                                          b = seq(0.1, 2, 0.1))
phenodata5 <- generate_phenodata_1_simple(genodata = genodata[,1],
                                          type = "binary", b = 0)
phenodata6 <- generate_phenodata_1(genodata = genodata[,1],
                                   type = "quantitative", b = 0,
                                   MAF_cutoff = 1, prop_causal = 0.1,
                                   direction = "a")
phenodata7 <- generate_phenodata_1(genodata = genodata,
                                   type = "quantitative", b = 0.6,
                                   MAF_cutoff = 0.1, prop_causal = 0.05,
                                   direction = "a")
phenodata8 <- generate_phenodata_1(genodata = genodata,
                                   type = "quantitative",
                                   b = seq(0.1, 2, 0.1),
                                   MAF_cutoff = 0.1, prop_causal = 0.05,
                                   direction = "a")
phenodata9 <- generate_phenodata_2_bvn(genodata = genodata[,1],
                                       tau = 0.5, b1 = 0, b2 = 0)
phenodata10 <- generate_phenodata_2_bvn(genodata = genodata,
                                        tau = 0.5, b1 = 0, b2 = 0)
phenodata11 <- generate_phenodata_2_bvn(genodata = genodata,
                                        tau = 0.5, b1 = 1,
                                        b2 = seq(0.1,2,0.1))
phenodata12 <- generate_phenodata_2_bvn(genodata = genodata,
                                        tau = 0.5, b1 = 1, b2 = 2)
par(mfrow = c(3, 1))
hist(phenodata12$Y1)
hist(phenodata12$Y2)
plot(phenodata12$Y1, phenodata12$Y2)
phenodata13 <- generate_phenodata_2_copula(genodata = genodata[,1],
                                           MAF_cutoff = 1, prop_causal = 1,
                                           tau = 0.5, b1 = 0, b2 = 0)
phenodata14 <- generate_phenodata_2_copula(genodata = genodata,
                                           MAF_cutoff = 1, prop_causal = 0.5,
                                           tau = 0.5, b1 = 0, b2 = 0)
phenodata15 <- generate_phenodata_2_copula(genodata = genodata,
                                           MAF_cutoff = 1, prop_causal = 0.5,
                                           tau = 0.5, b1 = 0, b2 = 0)
phenodata16 <- generate_phenodata_2_copula(genodata = genodata,
                                           MAF_cutoff = 1, prop_causal = 0.5,
                                           tau = 0.2, b1 = 0.3,
                                           b2 = seq(0.1, 2, 0.1))
phenodata17 <- generate_phenodata_2_copula(genodata = genodata,
                                           MAF_cutoff = 1, prop_causal = 0.5,
                                           tau = 0.2, b1 = 0.3, b2 = 0.3)
par(mfrow = c(3, 1))
hist(phenodata17$Y1)
hist(phenodata17$Y2)
plot(phenodata17$Y1, phenodata17$Y2)