R: Scale (variance-based association) test

scaleReg {gJLS2}

R Documentation

Scale (variance-based association) test

Description

This function takes as input the genotypes of SNPs (GENO), the genetic sex (SEX), and a quantitative trait (Y) in a sample population, and possibly additional covariates, such as principal components. The function returns the scale association p-values for each SNP.

Usage

scaleReg(
  GENO,
  Y,
  COVAR = NULL,
  SEX = NULL,
  Xchr = FALSE,
  transformed = FALSE,
  loc_alg = "LAD",
  related = FALSE,
  cov.structure = "corCompSymm",
  clust = NULL,
  genotypic = FALSE,
  origLev = FALSE,
  centre = "median"
)

Arguments

`GENO`	a list containing a genotype matrix or vector of SNPs, with values 0, 1, or 2 coding the number of reference alleles. Alternatively, for imputed genotypes, it can be either a vector of dosage values between 0 and 2, or a list of matrices of genotype probabilities, each between 0 and 1. The length/dimension of `GENO` should match that of `Y` and, if supplied, `SEX` and `COVAR`.
`Y`	a vector of quantitative traits, such as human height.
`COVAR`	optional: a vector or matrix of covariates that are used to reduce bias due to confounding, such as age.
`SEX`	optional: the genetic sex of individuals in the sample population; must be a vector of 1s and 2s, following the default PLINK sex code (1 for males, 2 for females).
`Xchr`	a logical indicator for whether the analysis is for X-chromosome SNPs.
`transformed`	a logical indicating whether the quantitative response `Y` should be transformed using a rank-based method to resemble a normal distribution; recommended for traits with non-symmetric distribution. The default option is `FALSE`.
`loc_alg`	a character indicating the type of algorithm used to compute the centre in stage 1; the value is either "OLS", corresponding to an ordinary linear regression under Gaussian assumptions to compute the mean, or "LAD", corresponding to a quantile regression to compute the median. The recommended default option is "LAD". For the quantile regression, the function calls `quantreg::rq` and the median is estimated using either the "br" (smaller samples) or "sfn" (larger samples and sparse problems) algorithm, depending on the sample size; for more details see `?quantreg::rq`.
`related`	optional: a logical indicating whether the samples should be treated as related; if `TRUE` and no relatedness covariance information is given, the covariance is estimated under `cov.structure`, which is assumed to hold among all within-group errors belonging to the same pair/cluster, as specified using `clust`. This option currently only applies to autosomal SNPs.
`cov.structure`	optional: should be one of the standard classes of correlation structures listed in `corClasses` from the R package nlme. See `?corClasses`. The most commonly used option is `corCompSymm` for a compound symmetric correlation structure. This option currently only applies to autosomal SNPs.
`clust`	optional: a factor indicating the grouping of samples; it should have at least two distinct values. It could be the family ID (FID) for family studies. This option currently only applies to autosomal SNPs.
`genotypic`	a logical indicating whether the variance homogeneity should be tested with respect to an additively (linearly) coded or non-additively coded genotype (`GENO`). The former has one fewer degree of freedom than the latter and is the default option. For dosage genotypes without genotypic probabilities, `genotypic` is forced to be `FALSE`.
`origLev`	a logical indicator for whether the reported p-values should also include the original Levene's test.
`centre`	a character indicating whether the absolute deviation should be calculated with respect to "median" or "mean" when computing the traditional sex-specific and Fisher-combined Levene's test p-values (three tests) for the X-chromosome. The default value is "median". This option applies to sex-specific analysis using the original Levene's test (i.e. when `regression = TRUE`).
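
The `loc_alg`, `genotypic`, and `centre` arguments all refer to a two-stage, Levene-type procedure: stage 1 estimates the centre of `Y` and takes absolute residuals, and stage 2 regresses those residuals on genotype. The following is a minimal base-R sketch of that idea, not the package's implementation. It assumes unrelated samples and an autosomal SNP, uses `lm` for stage 1 (the "OLS" case; the "LAD" case would use `quantreg::rq` instead), and the helper name `scale_sketch` is hypothetical.

```r
# Hypothetical helper illustrating a two-stage Levene-type scale test.
# Assumptions: unrelated samples, autosomal SNP, hard-call genotypes 0/1/2.
scale_sketch <- function(geno, y, covar = NULL, genotypic = FALSE) {
  # Additive coding keeps genotype numeric; genotypic coding uses a factor.
  g <- if (genotypic) factor(geno) else geno
  dat <- data.frame(y = y, g = g)
  if (!is.null(covar)) dat <- cbind(dat, as.data.frame(covar))
  # Stage 1: location model fitted by OLS; keep the absolute residuals.
  stage1 <- lm(y ~ ., data = dat)
  d <- abs(resid(stage1))
  # Stage 2: regress the absolute residuals on genotype and compare
  # against an intercept-only model with an F-test.
  full <- lm(d ~ g)
  null <- lm(d ~ 1)
  anova(null, full)[2, "Pr(>F)"]
}

set.seed(1)
n <- 500
geno <- rbinom(n, 2, 0.3)
y <- rnorm(n)
p_add <- scale_sketch(geno, y)                    # additive coding, 1 df
p_gen <- scale_sketch(geno, y, genotypic = TRUE)  # genotypic coding, 2 df here
```

Under additive coding the stage 2 model spends one degree of freedom on genotype; under genotypic coding it spends one per non-baseline genotype group, which is why the additive test has one fewer degree of freedom when all three genotypes are observed.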

Value

a vector of Levene's test regression p-values according to the models specified.

Note

We recommend applying a quantile-normal transformation to Y to avoid the ‘scale-effect’, where the variance values tend to be proportional to the mean values when stratified by GENO.
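
One common way to carry out such a transformation is a rank-based inverse normal transform. The base-R sketch below is an illustration only: the helper name `inv_normal` and the rank offset of 0.5 are assumptions, and need not match what `transformed = TRUE` does internally.

```r
# Hypothetical helper: rank-based inverse normal transform.
# Assumption: ranks are mapped to probabilities as (rank - 0.5) / n;
# other offsets are also in common use.
inv_normal <- function(y) {
  r <- rank(y, na.last = "keep", ties.method = "average")
  n <- sum(!is.na(y))
  qnorm((r - 0.5) / n)
}

set.seed(1)
y_skewed <- rexp(200)          # right-skewed trait
y_int <- inv_normal(y_skewed)  # same ordering, approximately normal quantiles
```

Because the transform depends only on ranks, it preserves the ordering of individuals while removing the mean-variance relationship that a skewed trait can induce across genotype groups.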

Author(s)

Wei Q. Deng deng@utstat.toronto.edu, Lei Sun sun@utstat.toronto.edu

References

Deng WQ, Mao S, Kalnapenkis A, Esko T, Magi R, Pare G, Sun L. (2019) Analytical strategies to include the X-chromosome in variance heterogeneity analyses: Evidence for trait-specific polygenic variance structure. Genet Epidemiol. 43(7):815-830. doi: 10.1002/gepi.22247. PMID:31332826.

Gastwirth JL, Gel YR, Miao W. (2009). The Impact of Levene's Test of Equality of Variances on Statistical Theory and Practice. Statistical Science. 24(3):343-360. doi: 10.1214/09-STS301.

Soave D, Sun L. (2017). A generalized Levene's scale test for variance heterogeneity in the presence of sample correlation and group uncertainty. Biometrics. 73(3):960-971. doi: 10.1111/biom.12651. PMID:28099998.

Examples

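library(gJLS2)

# simulate 1000 individuals: one SNP, sex, a trait, and 10 covariates
N <- 1000
genoDAT <- rbinom(N, 2, 0.3)      # genotypes coded 0/1/2
sex <- rbinom(N, 1, 0.5)+1        # 1 = male, 2 = female (PLINK coding)
Y <- rnorm(N)
covar <- matrix(rnorm(N*10), ncol=10)

# vanilla example:
scaleReg(GENO=list(genoDAT, genoDAT), Y=Y, COVAR=covar)
# non-additive (genotypic) coding:
scaleReg(GENO=list(genoDAT, genoDAT), Y=Y, COVAR=covar, genotypic=TRUE)
# also report the original Levene's test:
scaleReg(GENO=list(genoDAT, genoDAT), Y=Y, COVAR=covar, origLev = TRUE)
# original Levene's test with sex supplied:
scaleReg(GENO=list(genoDAT, genoDAT), Y=Y, COVAR=covar, origLev = TRUE, SEX=sex)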
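
For imputed data, `GENO` may hold dosages or genotype probabilities, as described under Arguments. As a hedged illustration of how the two relate, the base-R sketch below converts a matrix of genotype probabilities into expected allele counts. The object names and the one-row-per-individual layout are hypothetical; the exact probability layout that `scaleReg` expects should be checked against the package.

```r
# Hypothetical input: one row per individual,
# columns are P(G = 0), P(G = 1), P(G = 2).
set.seed(1)
n <- 5
raw <- matrix(runif(n * 3), ncol = 3)
geno_prob <- raw / rowSums(raw)   # normalise so each row sums to 1

# Expected allele count (dosage) = 0*P(G=0) + 1*P(G=1) + 2*P(G=2).
dosage <- as.vector(geno_prob %*% c(0, 1, 2))
```

A dosage vector built this way lies between 0 and 2 but discards the separate genotype probabilities, which is consistent with `genotypic` being forced to `FALSE` for dosage-only input.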

[Package gJLS2 version 0.2.0 Index]