leveneRegX_per_SNP {gJLS2}    R Documentation

Levene's regression tests for variance homogeneity by SNP genotype (X-chromosome specific)

Description

This function takes as input the genotype of a SNP (geno_one), the genetic sex (SEX), a quantitative trait (Y) in a sample population, and possibly additional covariates, such as principal components. The function returns the scale association p-values for each X-chromosome SNP using the generalized Levene's test designed for X-chromosome biallelic markers.

Usage

leveneRegX_per_SNP(
  geno_one,
  SEX,
  Y,
  COVAR = NULL,
  genotypic = FALSE,
  transformed = TRUE,
  loc_alg = "LAD"
)

Arguments

geno_one

the genotype of a biallelic SNP; must be a vector of 0s, 1s, and 2s coding the number of reference alleles. Alternatively, for imputed genotypes, it can be a matrix or vector of dosage values, numerically between 0 and 2. The length/dimension of geno_one should match that of Y, and of SEX and COVAR when they are supplied.

SEX

optional: the genetic sex of individuals in the sample population; must be a vector of 1s and 2s following the default sex code in PLINK (1 for males and 2 for females).

Y

a vector of quantitative traits, such as human height.

COVAR

optional: a vector or matrix of covariates that are used to reduce bias due to confounding, such as age.

genotypic

optional: a logical indicating whether the variance homogeneity should be tested with respect to an additively (linearly) coded or non-additively coded geno_one. The former has one fewer degree of freedom than the latter and is the default option. For dosage genotypes without genotypic probabilities, genotypic is forced to be FALSE.

transformed

a logical indicating whether the quantitative response Y should be transformed using a rank-based method to resemble a normal distribution; recommended for traits with non-symmetric distribution. The default option is TRUE.

loc_alg

a character string indicating the type of algorithm used to compute the centre in stage 1; the value is either "OLS", corresponding to an ordinary linear regression under Gaussian assumptions to compute the mean, or "LAD", corresponding to a quantile regression to compute the median. The recommended default option is "LAD". For the quantile regression, the function calls quantreg::rq, and the median is estimated using either the "fn" (smaller samples) or "sfn" (larger samples and sparse problems) algorithm, depending on the sample size; for more details, see ?quantreg::rq.
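The roles of loc_alg and genotypic can be illustrated with a simplified two-stage Levene-type regression. The sketch below is not the gJLS2 implementation: levene_sketch is a hypothetical helper that ignores SEX and COVAR and does not apply the X-chromosome specific model. It assumes genotype group is the only stage 1 predictor, in which case the "LAD" centre reduces to the group median and the "OLS" centre to the group mean, so only base R is needed.

```r
# Hypothetical helper for illustration only; not part of gJLS2.
levene_sketch <- function(y, g, genotypic = FALSE, loc_alg = c("LAD", "OLS")) {
  loc_alg <- match.arg(loc_alg)
  grp <- factor(g)

  # Stage 1: centre of y within each genotype group
  # (group median for "LAD", group mean for "OLS").
  centre_fun <- if (loc_alg == "LAD") median else mean
  centre <- ave(y, grp, FUN = centre_fun)
  d <- abs(y - centre)  # absolute deviations from the centre

  # Stage 2: regress the absolute deviations on genotype.
  # Additive coding uses one slope; genotypic coding uses one term per
  # non-baseline genotype group.
  x <- if (genotypic) grp else as.numeric(g)
  fit <- lm(d ~ x)
  fstat <- summary(fit)$fstatistic
  p <- pf(fstat["value"], fstat["numdf"], fstat["dendf"], lower.tail = FALSE)

  list(df = unname(fstat["numdf"]), p.value = unname(p))
}

# Simulated data (an assumption for the demo): the trait's standard
# deviation grows with genotype.
set.seed(1)
n <- 600
g <- rbinom(n, 2, 0.3)
y <- rnorm(n, sd = 1 + 0.5 * g)

res_add <- levene_sketch(y, g)                    # additive coding
res_gen <- levene_sketch(y, g, genotypic = TRUE)  # genotypic coding
res_ols <- levene_sketch(y, g, loc_alg = "OLS")   # mean-centred variant
```

With three observed genotype groups, the additive test uses one numerator degree of freedom and the genotypic test uses two, which mirrors the difference described for the genotypic argument.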

Value

the Levene's test regression p-value according to the model specified.

Note

We recommend applying a quantile-normal transformation to Y to avoid the ‘scale-effect’, where the variance values tend to be proportional to the mean values when stratified by geno_one.
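The scale effect can be seen in a small simulation. The sketch below assumes one common form of rank-based inverse normal transformation; inverse_normal is a hypothetical helper, and the exact transformation applied inside gJLS2 when transformed = TRUE may differ. The simulated trait has a genotype effect on the mean of its logarithm only, so any difference in spread across genotype groups on the raw scale is induced by the skewed scale of measurement.

```r
# Hypothetical helper: rank-based inverse normal transformation.
# The offset 0.5 is one common choice, assumed here for illustration.
inverse_normal <- function(y) {
  r <- rank(y, na.last = "keep", ties.method = "average")
  qnorm((r - 0.5) / sum(!is.na(y)))
}

set.seed(2)
n <- 3000
g <- rbinom(n, 2, 0.3)

# Skewed trait: genotype shifts the mean on the log scale only.
y_raw <- exp(0.5 * g + rnorm(n))
y_int <- inverse_normal(y_raw)

# Standard deviation of the trait within each genotype group.
sd_raw <- tapply(y_raw, g, sd)
sd_int <- tapply(y_int, g, sd)

# Ratio of the largest to the smallest group standard deviation.
ratio_raw <- max(sd_raw) / min(sd_raw)
ratio_int <- max(sd_int) / min(sd_int)
print(c(raw = ratio_raw, transformed = ratio_int))
```

On the raw scale the group standard deviations rise with the group means; after the transformation they are much closer to one another, so a scale test is less likely to flag a pure mean effect.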

Author(s)

Wei Q. Deng deng@utstat.toronto.edu, Lei Sun sun@utstat.toronto.edu

References

Deng WQ, Mao S, Kalnapenkis A, Esko T, Magi R, Pare G, Sun L. (2019) Analytical strategies to include the X-chromosome in variance heterogeneity analyses: Evidence for trait-specific polygenic variance structure. Genet Epidemiol. 43(7):815-830. doi: 10.1002/gepi.22247. PMID:31332826.

Gastwirth JL, Gel YR, Miao W. (2009) The Impact of Levene's Test of Equality of Variances on Statistical Theory and Practice. Statistical Science. 24(3):343-360. doi: 10.1214/09-STS301.

Examples

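N <- 1000
# PLINK sex code: 1 = male, 2 = female
sex <- rbinom(N, 1, 0.5) + 1
Y <- rnorm(N)
# Allocate the genotype vector up front, then fill it in by sex
genDAT <- rep(NA, N)
# Females: genotypes coded 0, 1, 2
genDAT[sex == 2] <- rbinom(sum(sex == 2), 2, 0.3)
table(genDAT, sex)
# Males: genotypes coded 0, 1
genDAT[sex == 1] <- rbinom(sum(sex == 1), 1, 0.3)
table(genDAT, sex)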

leveneRegX_per_SNP(geno_one=genDAT, SEX=sex, Y=Y)
leveneRegX_per_SNP(geno_one=genDAT, SEX=sex, Y=Y, genotypic=TRUE)
leveneRegX_per_SNP(geno_one=genDAT, SEX=sex, Y=Y, loc_alg="OLS")


[Package gJLS2 version 0.2.0 Index]