rareGeneTest {thgenetics} | R Documentation |
Rare Variant Tests
Description
Runs tests for rare variants by running the "step-up" approach to choose the best set of variants (possibly signing them), and correcting by permutation. rareGeneTest
tests a single gene, and also allows for other aggregation models described in Hoffmann et al.. rarePathwayTest
provides an extension of this method, that does a step-up approach for inclusions of genes (for computional time). For larger genes, it may also be advantageous to run rarePathwayTest
which will automatically chop large genes into subpieces and run step-up on those pieces.
Usage
rareGeneTest(genotype, phenotype,
use_sign=TRUE, use_weight=TRUE, nperm=1000,
binary=all(phenotype==0 | phenotype==1, na.rm=TRUE),
strategy="step", thresh=1)
rarePathwayTest(genotype, genotype_gene, phenotype,
use_sign=TRUE, use_weight=TRUE, nperm=1000,
binary=all(phenotype==0 | phenotype==1, na.rm=TRUE),
CUT=15)
Arguments
genotype |
Matrix of individuals along the rows and genotypes along the columns. Each entry in the matrix should be coded |
phenotype |
Vector of same length as the number of rows of the matrix. Either continuous, or dichotomous (case= |
use_sign |
Whether to use a sign in the model for w, or just to weight all variants the same. Uses the sign of the estimated covariance between each marker and the trait to determine the sign of the marker. |
use_weight |
Whether to use a weight in the model for w (inverse variance of allele frequency in everyone (continuous) or just the controls (dichotomous)). |
nperm |
Number of permutations to run. |
binary |
Is the phenotype dichotomous, or continuous? (TRUE/FALSE) |
strategy |
‘step’ uses the step-up routine (default, recommended), ‘afreq’ tests all allele frequencies, and ‘thresh’ uses a fixed threshold. |
thresh |
Allele frequency for inclusion, used only when |
genotype_gene |
For |
CUT |
For |
Details
The methods here are as described in the Hoffmann et al. (PLoS ONE, to appear) paper. These methods are based on the idea of trying multiple models for rare variants, since prior information is generally not very accurate. The p-value is corrected for multiple comparisons by permutation.
References
Hoffmann, TJ, Marini, NJ, and Witte, JS. Comprehensive approach to analyzing rare variants. PLoS ONE, 5(11): e13584. https://doi.org/10.1371/journal/pone.0013584.
Examples
## This inefficient (for clarity) code will simulate
## a dataset in the correct format, and then run
## the different approaches described here.
## Constants for data generation
SEED <- 2
NCASE <- NCONT <- 500
NGENE <- 6
FUNCTIONAL <- 1:(NGENE/2) ## Half are functional
BETA0 <- -4
BETAG <- log(2)
set.seed(SEED) ## Reproducible results
nonfunctional <- setdiff(1:NGENE, FUNCTIONAL)
afreq <- runif(NGENE, 0.001, 0.01)
expit <- function(x)
return(exp(x) / (1 + exp(x)))
gcase <- matrix(0, nrow=NCASE, ncol=NGENE)
for(indiv in 1:NCASE){
affected <- FALSE
while(!affected){ ## while not affected
gcase[indiv, ] <- rbinom(NGENE, 2, afreq) ## draw up genotype
affected <- (expit(BETA0 +
BETAG*sum(gcase[indiv, FUNCTIONAL])) > runif(1))
}
}
cat("\n")
gcont <- matrix(0, nrow=NCONT, ncol=NGENE)
for(indiv in 1:NCONT){
unaffected <- FALSE
while(!unaffected){ ## while not unaffected
gcont[indiv, ] <- rbinom(NGENE, 2, afreq) ## draw up genotype
unaffected <- (1-expit(BETA0 +
BETAG*sum(gcont[indiv, FUNCTIONAL])) > runif(1))
}
}
cat("\n")
cat("# Rare functional variants cases =",
sum(gcase[,FUNCTIONAL]), "\n")
cat("# Rare functional variants controls =",
sum(gcont[,FUNCTIONAL]), "\n")
cat("# Rare non-functional variants cases =",
sum(gcase[,nonfunctional]), "\n")
cat("# Rare non-functional variants controls =",
sum(gcont[,nonfunctional]), "\n")
case <- c(rep(1,NCASE), rep(0,NCONT))
genotype <- rbind(gcase, gcont)
cat("P-value of the test:\n")
rareGeneTest(genotype, case)