cst {dotgen}    R Documentation

Correlation among association test statistics

Description

Calculates the correlation among genetic association test statistics.

Usage

cst(g, x = NULL)

Arguments

g

genotype matrix, one row per sample and one column per variant; missing values are allowed.

x

matrix of covariates, one row per sample; missing values are not allowed. The default, NULL, means no covariates.

Details

When the per-variant association analyses include no covariates, that is, x = NULL, the correlation among test statistics is the same as the correlation among variants, cor(g).
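
As a rough illustration of the no-covariate case, the sketch below simulates null phenotypes and compares the empirical correlation among per-variant statistics with cor(g). It uses base R only. The genotypes are simulated, and the statistic sqrt(n) * cor(variant, phenotype) is a simplifying assumption that stands in for whatever Z-scores an actual analysis produces; none of this is code from dotgen.

```r
set.seed(1)
n     <- 200   # samples
n_rep <- 2000  # simulated null phenotypes

# Hypothetical genotypes (allele counts 0/1/2); v2 copies v1 about 80% of the time
v1 <- rbinom(n, 2, 0.3)
v2 <- ifelse(runif(n) < 0.8, v1, rbinom(n, 2, 0.3))
v3 <- rbinom(n, 2, 0.4)
g  <- cbind(v1 = v1, v2 = v2, v3 = v3)

# Null phenotypes, one column per replicate, generated independently of g
y <- matrix(rnorm(n * n_rep), nrow = n)

# Assumed per-variant statistic: sqrt(n) * cor(variant, phenotype)
z <- sqrt(n) * cor(g, y)   # one row per variant, one column per replicate

r_emp <- cor(t(z))         # correlation among statistics across replicates
r_var <- cor(g)            # correlation among variants
max_diff <- max(abs(r_emp - r_var))
```

With enough replicates, max_diff should be small, reflecting sampling noise only.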

With covariates, the correlation among test statistics is not the same as cor(g). In this case, cst() takes the generalized inverse of the entire correlation matrix, cor(cbind(g, x)), and then inverts back only the submatrix corresponding to the genotype variables, g.
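
The two inversion steps described above can be sketched in base R as follows. This is a minimal sketch, not the cst() source: solve() stands in for a generalized inverse (the two coincide when the matrix is nonsingular), the final cov2cor() rescaling to unit diagonal is an assumption, and the simulated data and all variable names are hypothetical.

```r
set.seed(2)
n <- 300

# Hypothetical covariates
x <- cbind(age = rnorm(n), sex = rbinom(n, 1, 0.5))

# Hypothetical genotypes whose allele frequency depends on a covariate
g <- sapply(c(0.2, 0.3, 0.4), function(p) {
  rbinom(n, 2, plogis(qlogis(p) + 0.5 * x[, "age"]))
})
colnames(g) <- paste0("v", 1:3)
m <- ncol(g)

r_all <- cor(cbind(g, x))        # correlation of genotypes and covariates together
p_all <- solve(r_all)            # invert the entire matrix
s_gg  <- solve(p_all[1:m, 1:m])  # invert back only the genotype block
r_adj <- cov2cor(s_gg)           # assumed rescaling to a correlation matrix

# Cross-check: correlation among genotype residuals after regressing on x
r_res <- cor(resid(lm(g ~ x)))
max_diff <- max(abs(r_adj - r_res))
```

Under these assumptions, r_adj agrees with r_res up to numerical error, which suggests reading the result as the correlation among variants after adjusting for the covariates.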

If the Z-scores were calculated from genotypes with some missing values, the correlation among test statistics is reduced by an amount that can be derived theoretically. It can be shown that this reduced correlation is obtained by imputing the missing values with the averages of the non-missing values. Therefore, by default, cst() fills the missing values in each variant with the average of the non-missing values of that same variant (i.e., imputation by average, imp_avg()). Other imputation methods are also available (see topic imp for techniques that may improve power), but note that techniques other than imputation by average require one to re-run the association analyses with the imputed variants, to ensure that the correlation among the new statistics (i.e., Z-scores) and the correlation among the imputed variants are identical. Otherwise, Type I error may be inflated for decorrelation-based methods.
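
Imputation by average can be sketched in a few lines of base R. The helper impute_mean() below is hypothetical and written only for illustration; it is not the package's imp_avg(), and the small genotype matrix is made up.

```r
# Hypothetical helper: replace each NA with the mean of the non-missing
# values in the same column (variant)
impute_mean <- function(g) {
  apply(g, 2, function(v) {
    v[is.na(v)] <- mean(v, na.rm = TRUE)
    v
  })
}

# Made-up genotypes with one missing value per variant
g <- cbind(v1 = c(0, 1, 2, NA, 2, 1),
           v2 = c(0, NA, 2, 2, 2, 1))

g_imp <- impute_mean(g)

r_complete <- cor(g, use = "complete.obs")  # samples with no missing values only
r_imputed  <- cor(g_imp)                    # after imputation by average
```

In this toy example the correlation after imputation is smaller than the complete-case correlation, in line with the reduction described above.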

Value

Correlation matrix among association test statistics.

See Also

imp, imp_avg()

Examples

## get genotype and covariate matrices
gno <- readRDS(system.file("extdata", 'rs208294_gno.rds', package="dotgen"))
cvr <- readRDS(system.file("extdata", 'rs208294_cvr.rds', package="dotgen"))

## correlation among association statistics, covariates involved
res <- cst(gno, cvr)
print(res[1:4, 1:4])

## genotype matrix with 2% randomly missing data
g02 <- readRDS(system.file("extdata", 'rs208294_g02.rds', package="dotgen"))
cvr <- readRDS(system.file("extdata", 'rs208294_cvr.rds', package="dotgen"))
res <- cst(g02, cvr)
print(res[1:4, 1:4])


[Package dotgen version 0.1.0 Index]