DGP {AllelicSeries}

R Documentation

Data Generating Process

Description

Generate a data set consisting of:

"anno"	A SNP-length annotation vector.
"covar"	A subject by 6 covariate matrix.
"geno"	A subject by SNP genotype matrix.
"pheno"	A subject-length phenotype vector.

Usage

DGP(
  anno = NULL,
  beta = c(0, 1, 2),
  binary = FALSE,
  geno = NULL,
  include_residual = TRUE,
  indicator = FALSE,
  maf_range = c(0.005, 0.01),
  method = "none",
  n = 100,
  p_dmv = 0.4,
  p_ptv = 0.1,
  prop_causal = 1,
  random_signs = FALSE,
  random_var = 0,
  snps = 100,
  weights = c(1, 2, 3)
)

Arguments

`anno`	Annotation vector, if providing genotypes. Its length should match the number of columns in `geno`.
`beta`	If method = "none", a (3 x 1) coefficient vector for benign missense variants (BMVs), deleterious missense variants (DMVs), and protein truncating variants (PTVs), respectively. If method != "none", a scalar effect size.
`binary`	Generate binary phenotype? Default: FALSE.
`geno`	Genotype matrix, if providing genotypes.
`include_residual`	Include residual? If FALSE, returns the expected value. Intended for testing.
`indicator`	Convert raw counts to indicators? Default: FALSE.
`maf_range`	Range of minor allele frequencies: c(MIN, MAX).
`method`	Genotype aggregation method. Default: "none".
`n`	Sample size.
`p_dmv`	Frequency of deleterious missense variants. Default of 40% is based on the frequency of DMVs among rare coding variants in the UK Biobank.
`p_ptv`	Frequency of protein truncating variants. Default of 10% is based on the frequency of PTVs among rare coding variants in the UK Biobank.
`prop_causal`	Proportion of variants which are causal. Default: 1.0.
`random_signs`	Randomize signs? FALSE for burden-type genetic architecture, TRUE for SKAT-type.
`random_var`	Frailty variance in the case of random signs. Default: 0.
`snps`	Number of SNPs in the gene. Default: 100.
`weights`	Aggregation weights.
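
The `anno` and `geno` arguments have to agree in dimension, which is easy to get wrong when supplying your own data. The following is a minimal sketch of building a compatible pair in base R. It is not the package's internal simulation: the coding of genotypes as minor allele counts in {0, 1, 2}, the annotation codes 0, 1, 2 for BMV, DMV, PTV, and the uniform draw of allele frequencies from `maf_range` are all assumptions made for illustration. The call to `DGP` is guarded so the sketch runs even when the package is not installed.

```r
set.seed(101)

n <- 50      # Subjects (rows of geno).
snps <- 20   # Variants (columns of geno).

# Assumption: genotypes are minor allele counts in {0, 1, 2}, with each
# variant's minor allele frequency drawn uniformly from maf_range.
maf_range <- c(0.005, 0.01)
maf <- runif(snps, min = maf_range[1], max = maf_range[2])
geno <- sapply(maf, function(f) rbinom(n, size = 2, prob = f))

# Assumption: annotations are coded 0 (BMV), 1 (DMV), 2 (PTV). Check the
# coding expected by your installed version before relying on this.
p_dmv <- 0.4
p_ptv <- 0.1
anno <- sample(
  c(0, 1, 2),
  size = snps,
  replace = TRUE,
  prob = c(1 - p_dmv - p_ptv, p_dmv, p_ptv)
)

# The documented requirement: one annotation per column of geno.
stopifnot(length(anno) == ncol(geno))

# Only call DGP if the package is installed.
if (requireNamespace("AllelicSeries", quietly = TRUE)) {
  data <- AllelicSeries::DGP(anno = anno, geno = geno, n = n, snps = snps)
  str(data, max.level = 1)
}
```

With allele frequencies this low and only 50 subjects, many columns of `geno` will contain no minor alleles at all; a larger `n` gives more informative genotypes.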

Value

List containing: genotypes (`geno`), annotations (`anno`), covariates (`covar`), and phenotypes (`pheno`).
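
The shapes given in the Description imply some consistency conditions on the returned list. A small checker along the following lines can make them explicit. `check_dgp_output` is a hypothetical helper, not part of AllelicSeries, and the mock list exists only so the sketch runs without the package.

```r
# Hypothetical helper (not part of AllelicSeries): checks that a list has
# the four documented components with mutually consistent dimensions.
check_dgp_output <- function(data) {
  stopifnot(all(c("anno", "covar", "geno", "pheno") %in% names(data)))
  n <- nrow(data$geno)
  stopifnot(
    length(data$anno) == ncol(data$geno),  # SNP-length annotation vector.
    nrow(data$covar) == n,                 # Subject by covariate matrix.
    ncol(data$covar) == 6,                 # Six covariates, per Description.
    length(data$pheno) == n                # Subject-length phenotype vector.
  )
  invisible(TRUE)
}

# Mock object with the documented shapes, used so the sketch runs without
# the package installed.
mock <- list(
  anno = rep(0, 4),
  covar = matrix(0, nrow = 3, ncol = 6),
  geno = matrix(0, nrow = 3, ncol = 4),
  pheno = rep(0, 3)
)
check_dgp_output(mock)
```

The same function can be applied to the result of a `DGP()` call.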

Examples

# Generate data.
data <- DGP(n = 100)

# View components.
table(data$anno)
head(data$covar)
head(data$geno[, 1:5])
hist(data$pheno)
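
# Sketch: per-category variant counts.

To illustrate how `beta` and `indicator` relate to the annotations when method = "none", the sketch below sums allele counts within each annotation category and applies one coefficient per category. `count_by_category` is a hypothetical helper, not a package function. The annotation codes 0, 1, 2 and the linear form `counts %*% beta` are assumptions about how the genetic component could be formed, not a statement of the package's internals.

```r
# Hypothetical helper (not part of AllelicSeries). Sums allele counts within
# each annotation category, giving a subject by category matrix.
count_by_category <- function(anno, geno, levels = c(0, 1, 2),
                              indicator = FALSE) {
  counts <- sapply(levels, function(l) {
    rowSums(geno[, anno == l, drop = FALSE])
  })
  # Force a matrix even if sapply simplified to a vector.
  counts <- matrix(counts, nrow = nrow(geno), ncol = length(levels))
  colnames(counts) <- paste0("anno_", levels)
  if (indicator) {
    # Convert raw counts to 0/1 carrier indicators.
    counts <- 1 * (counts > 0)
  }
  counts
}

# Toy data: 3 subjects, 4 variants (two in category 0, one each in 1 and 2).
anno <- c(0, 0, 1, 2)
geno <- rbind(
  c(1, 0, 0, 0),
  c(0, 2, 1, 0),
  c(0, 0, 0, 1)
)

counts <- count_by_category(anno, geno)
carriers <- count_by_category(anno, geno, indicator = TRUE)

# Assumed linear genetic component with one coefficient per category.
beta <- c(0, 1, 2)
eta <- as.numeric(counts %*% beta)
```

The helper can be applied to `data$anno` and `data$geno` from the example above, provided the annotation codes passed as `levels` match those in `data$anno`.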

[Package AllelicSeries version 0.0.4.1 Index]