dot {dotgen}	R Documentation

Decorrelation by Orthogonal Transformation (DOT)

Description

dot() decorrelates genetic association test statistics by a special symmetric orthogonal transformation.

Usage

dot(Z, C, tol.cor = NULL, tol.egv = NULL, ...)

Arguments

Z

vector of association test statistics (i.e., Z-scores).

C

correlation matrix among the association test statistics, as obtained by cst().

tol.cor

tolerance threshold for the largest absolute correlation.

tol.egv

tolerance threshold for the smallest eigenvalue.

...

additional parameters.

Details

Genetic association studies typically provide per-variant test statistics that can be converted to asymptotically normal, signed Z-scores. Once those Z-scores are transformed to independent random variables, various methods can be applied to combine them and obtain an overall association for the SNP set.

dot() uses per-variant genetic association test statistics and the correlation among them to decorrelate Z-scores.

To estimate the correlation among genetic association test statistics, use cst(). If P-values and estimated effects (i.e., beta coefficients) are given instead of test statistics, zsc() can be used to recover the test statistics (i.e., Z-scores).
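The recovery of Z-scores from P-values and effects can be illustrated in base R. The following is only a sketch: z_from_p() is a hypothetical helper, not the implementation of zsc(), and it assumes that the P-values are two-sided and that the sign of each Z-score matches the sign of its beta coefficient.

```r
## hypothetical helper, NOT the package's zsc(); assumes two-sided P-values
z_from_p <- function(P, BETA) {
    ## |Z| is the upper-tail normal quantile of P / 2; the sign follows BETA
    sign(BETA) * qnorm(P / 2, lower.tail = FALSE)
}

## round trip on made-up Z-scores
z0 <- c(-2.5, -0.3, 1.2, 3.1)
p0 <- 2 * pnorm(abs(z0), lower.tail = FALSE)  # two-sided P-values
b0 <- 0.1 * z0                                # any effects with matching signs
z1 <- z_from_p(p0, b0)
print(all.equal(z0, z1))
```

With real data, zsc() should be preferred, since it may handle edge cases that this sketch ignores.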

tol.cor: variants with correlation too close to 1 in absolute value are considered to be collinear and only one of them will be retained to ensure that the LD matrix is full-rank. The maximum value for tolerable correlation is 1 - tol.cor. The default value for tol.cor is sqrt(.Machine$double.eps).
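The tol.cor rule can be sketched in base R as follows. drop_collinear() is a hypothetical helper written for illustration, and the choice to retain the first variant of each collinear group is an assumption; dot() may select the retained variant differently.

```r
## hypothetical helper illustrating the tol.cor rule; NOT the package's code
drop_collinear <- function(C, tol.cor = sqrt(.Machine$double.eps)) {
    keep <- rep(TRUE, nrow(C))
    for (i in seq_len(nrow(C))) {
        if (!keep[i]) next
        ## later variants whose |correlation| with variant i exceeds 1 - tol.cor
        dup <- abs(C[i, ]) > 1 - tol.cor
        dup[seq_len(i)] <- FALSE
        keep[dup] <- FALSE
    }
    keep
}

## made-up correlation matrix: variants 1 and 2 are perfectly (negatively) correlated
C <- matrix(c( 1.0, -1.0,  0.3,
              -1.0,  1.0, -0.3,
               0.3, -0.3,  1.0), 3, 3)
keep <- drop_collinear(C)
print(keep)
print(C[keep, keep, drop = FALSE])
```

The same logical index would also be applied to the vector of Z-scores so that it stays aligned with the reduced matrix.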

tol.egv: negative and close-to-zero eigenvalues are truncated from matrix D in H = EDE'. The corresponding columns of E are also deleted. Note that the dimension of the square matrix H does not change after this truncation. See the DOT publication in the References below for more details on the definitions of the E and D matrices. The default eigenvalue tolerance value is sqrt(.Machine$double.eps).
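The truncation can be sketched in base R. This is an illustration under stated assumptions, not the package's code: dot_sketch() is a hypothetical function, E is taken to hold the eigenvectors of C, and D is assumed to hold the inverse square roots of the retained eigenvalues, so that H is a symmetric inverse square root of C.

```r
## hypothetical sketch of the decorrelation step; NOT the package's code
dot_sketch <- function(Z, C, tol.egv = sqrt(.Machine$double.eps)) {
    egn <- eigen(C, symmetric = TRUE)
    keep <- egn$values > tol.egv            # drop negative and near-zero eigenvalues
    E <- egn$vectors[, keep, drop = FALSE]  # matching columns of E
    D <- diag(1 / sqrt(egn$values[keep]), nrow = sum(keep))
    H <- E %*% D %*% t(E)                   # still length(Z) x length(Z)
    list(X = drop(H %*% Z), H = H, L = sum(keep))
}

## made-up correlation matrix and Z-scores
C <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.4,
              0.2, 0.4, 1.0), 3, 3)
Z <- c(1.8, 2.1, -0.4)
res <- dot_sketch(Z, C)
print(dim(res$H))                                   # same dimension as C
print(all.equal(res$H %*% C %*% res$H, diag(3)))    # H C H' = I for full-rank C
```

Because E keeps all of its rows and loses only columns, the product EDE' stays square with the dimension of C, whatever the number of eigenvalues removed.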

A number of methods are available for combining de-correlated P-values; see dot_sst for details.

Value

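a list with

X: the decorrelated association test statistics.

H: the orthogonal transformation used to obtain X from Z.

L: the degrees of freedom of the chi-square statistic formed by the sum of squares of X (see Examples).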

References

Vsevolozhskaya, O. A., Shi, M., Hu, F., & Zaykin, D. V. (2020). DOT: Gene-set analysis by combining decorrelated association statistics. PLOS Computational Biology, 16(4), e1007819.

See Also

cst(), zsc(), dot_sst

Examples

## get genotype and covariate matrices
gno <- readRDS(system.file("extdata", 'rs208294_gno.rds', package="dotgen"))
cvr <- readRDS(system.file("extdata", 'rs208294_cvr.rds', package="dotgen"))

## estimate the correlation among association test statistics
sgm <- cst(gno, cvr)

## get the result of genetic association analysis (P-values and effects)
res <- readRDS(system.file("extdata", 'rs208294_res.rds', package="dotgen"))

## recover Z-score statistics
stt <- with(res, zsc(P, BETA))

## decorrelate Z-scores by DOT
result <- dot(stt, sgm)
print(result$X)          # decorrelated statistics
print(result$H)          # orthogonal transformation

## sum of squares of decorrelated statistics is a chi-square
ssq <- sum(result$X^2)
pvl <- pchisq(ssq, df=result$L, lower.tail=FALSE)   # upper tail, avoids 1 - p rounding

print(ssq)            # sum of squares = 35.76306
print(pvl)            # chisq P-value =  0.001132132

[Package dotgen version 0.1.0 Index]