rbm.GRR {Ravages} | R Documentation |
Simulation of genetic data using GRR values
Description
Generates a simulated bed.matrix with genotypes for cases and controls based on GRR values
Usage
rbm.GRR(genes.maf = Kryukov, size, prev, replicates,
GRR.matrix.del, GRR.matrix.pro = NULL,
p.causal = 0.5, p.protect = 0, same.variant = FALSE,
genetic.model=c("general", "multiplicative", "dominant", "recessive"),
select.gene, selected.controls = T, max.maf.causal = 0.01)
Arguments
genes.maf |
A dataframe containing at least the MAF in the general population (column maf) for variants with their associated gene (column gene), by default the file |
size |
A vector containing the size of each group (the first one being the control group) |
prev |
A vector containing the prevalence of each group of cases |
replicates |
The number of simulations to perform |
GRR.matrix.del |
A list containing the GRR matrix associated to the heterozygous genotype compared to the homozygous reference genotype as if all variants are deleterious. An additional GRR matrix associated to the homozygous for the alternate allele is needed if |
GRR.matrix.pro |
The same argument as |
p.causal |
The proportion of causal variants in cases |
p.protect |
The proportion of protective variants in cases among causal variants |
same.variant |
TRUE/FALSE: whether the causal variants are the same in the different groups of cases |
genetic.model |
The genetic model of the disease |
select.gene |
Which gene to choose from |
selected.controls |
Whether controls are selected controls (by default) or controls from the general population |
max.maf.causal |
Only variants with a MAF lower than this threshold can be sampled as causal variants. |
Details
The genetic model of the disease needs to be specified in this function.
If genetic.model="general"
, there is no link between the GRR for the heterozygous genotype and the GRR for the homozygous alternative genotype.
Therefore, the user has to give two matrices of GRR, one for the heterozygous genotype, the other for the homozygous alternative genotype.
If genetic.model="multiplicative"
, we assume that the the GRR for the homozygous alternative genotype is the square of the GRR for the heterozygous genotype.
If genetic.model="dominant"
, we assume that the GRR for the heterozygous genotype and the GRR for the homozygous alternative genotype are equal.
If genetic.model="recessive"
, we assume that the GRR for the heterozygous genotype is equal to 1: the GRR given is the one associated to the homozygous alternative genotype.
GRR.matrix.del
contains GRR values as if all variants are deleterious. These values will be used only for the proportion p.causal
of variants that will be sampled as causal.
If selected.controls
= T, genotypic frequencies in the control group are computed from genotypic frequencies in the cases groups and the prevalence of the disease.
If FALSE, genotypic frequencies in the control group are computed from allelic frequencies under Hardy-Weinberg equilibrium.
The files Kryukov
or GnomADgenes
available with the package Ravages can be used as the argument genes.maf
.
If GRR.matrix.del
(or GRR.matrix.pro
) has been generated using the function GRR.matrix
, the arguments genes.maf
and select.gene
should have
the same value as in GRR.matrix
.
Only non-monomorphic variants are kept for the simulations.
Causal variants that have been sampled in each group of individuals are indicated in x@ped$Causal
.
Value
A bed.matrix with as much columns (variants) as replicates
*number of variants.
The field x@snps$genomic.region
contains the replicate number and the field x@ped$pheno
contrains the group of each individual, "0" being the controls group.
See Also
GRR.matrix
, Kryukov
, GnomADgenes
, rbm.GRR.power
Examples
#GRR values calculated with the SKAT formula
GRR.del <- GRR.matrix(GRR = "SKAT", genes.maf = Kryukov,
n.case.groups = 2, select.gene = "R1",
GRR.multiplicative.factor=2)
#Simulation of one group of 1,000 controls and two groups of 500 cases,
#each one with a prevalence of 0.001
#with 50% of causal variants, 5 genomic regions are simulated.
x <- rbm.GRR(genes.maf = Kryukov, size = c(1000, 500, 500),
prev = c(0.001, 0.001), GRR.matrix.del = GRR.del,
p.causal = 0.5, p.protect = 0, select.gene="R1",
same.variant = FALSE,
genetic.model = "multiplicative", replicates = 5)