scaleReg {gJLS2}    R Documentation

Scale (variance-based association) test

Description

This function takes as input the genotypes of SNPs (GENO), the genetic sex (SEX), and a quantitative trait (Y) in a sample population, and optionally additional covariates (COVAR), such as principal components. The function returns the scale association p-value for each SNP.

Usage

scaleReg(
  GENO,
  Y,
  COVAR = NULL,
  SEX = NULL,
  Xchr = FALSE,
  transformed = FALSE,
  loc_alg = "LAD",
  related = FALSE,
  cov.structure = "corCompSymm",
  clust = NULL,
  genotypic = FALSE,
  origLev = FALSE,
  centre = "median"
)

Arguments

GENO

a list containing a genotype matrix/vector of SNPs, with values 0, 1, or 2 coding the number of reference alleles. Alternatively, for imputed genotypes, it could be either a vector of dosage values between 0 and 2, or a list of matrices of genotype probabilities, each numerically between 0 and 1. The length/dimension of GENO should match that of Y, and of SEX and COVAR when supplied.
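The three genotype formats described above can be illustrated with simulated data. This is a minimal sketch in base R; the object names and the column order of the probability matrix (0, 1, 2 copies of the allele) are assumptions made for illustration, not something the package documentation specifies.

```r
# Hypothetical illustration of the genotype formats described above.
set.seed(1)
n <- 8

# 1. Hard-called genotypes: allele counts 0, 1, or 2.
geno_count <- rbinom(n, 2, 0.3)

# 2. Genotype probabilities: one row per individual, one column per genotype.
#    The column order (0, 1, 2 copies) is an assumption made for this sketch.
raw <- matrix(runif(n * 3), ncol = 3)
geno_prob <- raw / rowSums(raw)          # each row sums to 1

# 3. Dosage: expected allele count, between 0 and 2.
geno_dosage <- as.vector(geno_prob %*% c(0, 1, 2))

# Basic checks that mirror the stated requirements.
stopifnot(all(geno_count %in% 0:2))
stopifnot(all(abs(rowSums(geno_prob) - 1) < 1e-12))
stopifnot(all(geno_dosage >= 0 & geno_dosage <= 2))
```

Whichever format is used, the number of individuals in each element should equal length(Y).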

Y

a vector of quantitative traits, such as human height.

COVAR

optional: a vector or matrix of covariates that are used to reduce bias due to confounding, such as age.

SEX

optional: the genetic sex of individuals in the sample population; must be a vector of 1s and 2s, following the default PLINK sex code (1 for males, 2 for females).

Xchr

a logical indicator for whether the analysis is for X-chromosome SNPs.

transformed

a logical indicating whether the quantitative response Y should be transformed using a rank-based method to resemble a normal distribution; recommended for traits with non-symmetric distribution. The default option is FALSE.
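A rank-based transformation to normality can be sketched as follows. The helper name inverse_normal and the 0.5 rank offset are assumptions chosen for illustration; the exact transformation applied when transformed = TRUE may differ.

```r
# Rank-based inverse normal transformation (a common choice; the offset 0.5
# is an assumption, not necessarily what scaleReg uses internally).
inverse_normal <- function(y) {
  r <- rank(y, ties.method = "average", na.last = "keep")
  n <- sum(!is.na(y))
  qnorm((r - 0.5) / n)
}

set.seed(2)
y_skewed <- rexp(500)             # right-skewed trait
y_norm   <- inverse_normal(y_skewed)

# The transformation preserves the ordering of the observations.
stopifnot(all(order(y_skewed) == order(y_norm)))
```

Because only the ranks are used, the transformed trait is symmetric around zero regardless of how skewed the original trait was.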

loc_alg

a character indicating the type of algorithm to compute the centre in stage 1; the value is either "OLS", corresponding to an ordinary linear regression under Gaussian assumptions to compute the mean, or "LAD", corresponding to a quantile regression to compute the median. The recommended default option is "LAD". For the quantile regression, the function calls quantreg::rq and the median is estimated using either the "br" (smaller samples) or "sfn" (larger samples and sparse problems) algorithm, depending on the sample size; for more details, see ?quantreg::rq.
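The two-stage idea behind the location algorithm can be sketched on simulated data: stage 1 estimates the centre and takes absolute residuals, and stage 2 regresses those residuals on the genotype. This is an illustrative approximation, not the package's internal code; the variable names are hypothetical, and the sketch falls back to OLS when quantreg is not installed.

```r
set.seed(3)
n <- 500
G <- rbinom(n, 2, 0.3)
Y <- rnorm(n, sd = 1 + 0.3 * G)   # variance increases with genotype (simulated)

# Stage 1: estimate the centre of Y given G.
# "LAD" corresponds to median regression (tau = 0.5); if quantreg is not
# installed, this sketch falls back to OLS so that it still runs.
if (requireNamespace("quantreg", quietly = TRUE)) {
  stage1 <- quantreg::rq(Y ~ G, tau = 0.5)
} else {
  stage1 <- lm(Y ~ G)
}
d <- abs(as.numeric(resid(stage1)))   # absolute residuals

# Stage 2: regress the absolute residuals on the genotype.
stage2 <- lm(d ~ G)
p_scale <- summary(stage2)$coefficients["G", "Pr(>|t|)"]
p_scale
```

Using the median in stage 1 makes the absolute residuals less sensitive to outliers and skewness than using the mean.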

related

optional: a logical indicating whether the samples should be treated as related. If TRUE and no relatedness covariance information is given, the covariance is estimated under cov.structure, and this structure is assumed for all within-group errors belonging to the same pair/cluster, if specified using clust. This option currently only applies to autosomal SNPs.

cov.structure

optional: should be one of the standard classes of correlation structures listed in corClasses from the R package nlme. See ?corClasses. The most commonly used option is corCompSymm for a compound symmetric correlation structure. This option currently only applies to autosomal SNPs.

clust

optional: a factor indicating the grouping of samples; it should have at least two distinct values. It could be the family ID (FID) for family studies. This option currently only applies to autosomal SNPs.
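How related, cov.structure, and clust might fit together can be sketched with nlme::gls and a compound symmetric correlation within families. The data, the column names (fam, G, Y, d), and the choice to model correlation only in stage 2 are assumptions for illustration; the package's internal handling may differ.

```r
library(nlme)   # provides gls() and corCompSymm()

set.seed(4)
n_fam <- 100
dat <- data.frame(
  fam = factor(rep(seq_len(n_fam), each = 2)),   # hypothetical family ID
  G   = rbinom(2 * n_fam, 2, 0.3)
)
# A shared family effect induces within-family correlation in the trait.
dat$Y <- rnorm(2 * n_fam) + rep(rnorm(n_fam, sd = 0.5), each = 2)

# Stage 1: absolute residuals from a location model.
dat$d <- abs(resid(lm(Y ~ G, data = dat)))

# Stage 2: generalized least squares with compound symmetric errors
# within each family.
fit <- gls(d ~ G, data = dat,
           correlation = corCompSymm(form = ~ 1 | fam))
p_scale <- summary(fit)$tTable["G", "p-value"]
p_scale
```

Here fam plays the role of clust, and corCompSymm plays the role of cov.structure.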

genotypic

a logical indicating whether the variance homogeneity should be tested with respect to an additively (linearly) coded or non-additively coded genotype. The former has one less degree of freedom than the latter and is the default option. For dosage genotypes without genotypic probabilities, genotypic is forced to be FALSE.
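The difference in degrees of freedom between the additive and genotypic codings can be seen in a small stage 2 comparison. This is a sketch on simulated data with hypothetical variable names; it only illustrates the coding, not the package's exact test.

```r
set.seed(5)
n <- 600
G <- rbinom(n, 2, 0.4)
Y <- rnorm(n)

# Absolute residuals after removing genotype-specific means.
d <- abs(resid(lm(Y ~ factor(G))))

null_fit      <- lm(d ~ 1)
additive_fit  <- lm(d ~ G)            # genotype as a numeric count
genotypic_fit <- lm(d ~ factor(G))    # genotype as a three-level factor

a_add <- anova(null_fit, additive_fit)
a_gen <- anova(null_fit, genotypic_fit)

# Degrees of freedom used by each test of variance homogeneity.
c(df_additive = a_add$Df[2], df_genotypic = a_gen$Df[2])
```

The additive coding tests a single linear trend across genotypes, while the genotypic coding allows each genotype group its own level of dispersion.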

origLev

a logical indicator for whether the reported p-values should also include the original Levene's test.

centre

a character indicating whether the absolute deviation should be calculated with respect to the "median" or the "mean" in the traditional sex-specific and Fisher combined Levene's test p-values (three tests) for the X-chromosome. The default value is "median". This option applies to sex-specific analysis using the original Levene's test (i.e. when regression = TRUE).
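The three tests mentioned above (male-specific, female-specific, and Fisher combined) can be sketched in base R. The helper levene_p, the simulated data, and the 0/1 coding of male genotypes are assumptions for illustration only; the package's implementation may differ in detail.

```r
# Original Levene's test: one-way ANOVA on absolute deviations from the
# group centre (median or mean). Hypothetical helper for this sketch.
levene_p <- function(y, g, centre = c("median", "mean")) {
  centre <- match.arg(centre)
  g <- factor(g)
  ctr <- ave(y, g, FUN = if (centre == "median") median else mean)
  d <- abs(y - ctr)
  anova(lm(d ~ g))[["Pr(>F)"]][1]
}

set.seed(6)
n <- 800
sex <- rbinom(n, 1, 0.5) + 1             # 1 = male, 2 = female
# X-chromosome genotypes: males carry one copy (coded 0/1 here, an
# assumption), females carry two (coded 0/1/2).
G <- ifelse(sex == 1, rbinom(n, 1, 0.3), rbinom(n, 2, 0.3))
Y <- rnorm(n)

# Sex-specific Levene's tests.
p_male   <- levene_p(Y[sex == 1], G[sex == 1], centre = "median")
p_female <- levene_p(Y[sex == 2], G[sex == 2], centre = "median")

# Fisher's method combines k independent p-values into a chi-square
# statistic with 2k degrees of freedom (k = 2 here).
fisher_stat <- -2 * (log(p_male) + log(p_female))
p_fisher <- pchisq(fisher_stat, df = 4, lower.tail = FALSE)

c(male = p_male, female = p_female, fisher = p_fisher)
```

Switching centre to "mean" gives the classical Levene statistic, while "median" gives the more robust Brown-Forsythe variant.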

Value

a vector of Levene's test regression p-values according to the models specified.

Note

We recommend applying a quantile-normal transformation to Y to avoid the ‘scale effect’, where the variance values tend to be proportional to the mean values when stratified by GENO.

Author(s)

Wei Q. Deng deng@utstat.toronto.edu, Lei Sun sun@utstat.toronto.edu

References

Deng WQ, Mao S, Kalnapenkis A, Esko T, Magi R, Pare G, Sun L. (2019) Analytical strategies to include the X-chromosome in variance heterogeneity analyses: Evidence for trait-specific polygenic variance structure. Genet Epidemiol. 43(7):815-830. doi: 10.1002/gepi.22247. PMID:31332826.

Gastwirth JL, Gel YR, Miao W. (2009) The Impact of Levene's Test of Equality of Variances on Statistical Theory and Practice. Statistical Science. 24(3):343-360. doi: 10.1214/09-STS301.

Soave D, Sun L. (2017). A generalized Levene's scale test for variance heterogeneity in the presence of sample correlation and group uncertainty. Biometrics. 73(3):960-971. doi: 10.1111/biom.12651. PMID:28099998.

Examples

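library(gJLS2)

# simulate genotypes, sex (1 = male, 2 = female), a trait, and 10 covariates
N <- 1000
genoDAT <- rbinom(N, 2, 0.3)
sex <- rbinom(N, 1, 0.5) + 1
Y <- rnorm(N)
covar <- matrix(rnorm(N * 10), ncol = 10)

# vanilla example: default additive coding, with covariates
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar)
# genotypic (non-additive) coding
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar, genotypic = TRUE)
# also report the original Levene's test
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar, origLev = TRUE)
# as above, additionally supplying sex
scaleReg(GENO = list(genoDAT, genoDAT), Y = Y, COVAR = covar, origLev = TRUE, SEX = sex)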


[Package gJLS2 version 0.2.0 Index]