cst {dotgen} | R Documentation |
Correlation among association test statistics
Description
Calculates the correlation among genetic association test statistics.
Usage
cst(g, x = NULL)
Arguments
g |
matrix of genotypes, one row per sample, one column per variant, missing values allowed. |
x |
matrix of covariates, one row per sample, no missing values allowed. |
Details
When no covariates are present in per-variant association analyses, that is,
x = NULL, the correlation among test statistics is the same as the
correlation among variants, cor(g).
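The no-covariate case can be checked by simulation. The sketch below is only an illustration in base R and does not call cst(): the simulated genotypes, the null phenotypes, and the names g, y, and z are all assumptions made for the example. It computes per-variant Z-scores for many null phenotypes and compares their empirical correlation with cor(g).

```r
set.seed(1)
n    <- 500    # samples (hypothetical)
reps <- 2000   # number of simulated null phenotypes (hypothetical)

# hypothetical genotypes coded 0/1/2; g2 is built to correlate with g1
g1 <- rbinom(n, 2, 0.3)
g2 <- ifelse(runif(n) < 0.7, g1, rbinom(n, 2, 0.3))
g3 <- rbinom(n, 2, 0.4)
g  <- cbind(g1, g2, g3)

# null phenotypes: one column per replicate, independent of the genotypes
y <- matrix(rnorm(n * reps), n, reps)

# approximate per-variant Z-scores: sqrt(n) times the sample correlation
# between each phenotype replicate and each variant (reps x 3 matrix)
z <- sqrt(n) * cor(y, g)

# the empirical correlation among Z-scores should be close to cor(g)
print(round(cor(z), 2))
print(round(cor(g), 2))
```

The two printed matrices should agree up to simulation noise, which shrinks as reps grows.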
With covariates, the correlation among test statistics is not the same as
cor(g). In this case, cst() takes the generalized inverse of the entire
correlation matrix, cor(cbind(g, x)), and then inverts back only the
submatrix corresponding to the genotype variables, g.
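The two-step inversion can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, solve() is used in place of a generalized inverse (adequate only when the joint correlation matrix is full rank), and the final rescaling with cov2cor() is an assumption about how the result is turned back into a correlation matrix. The last step cross-checks the result against the correlation among residuals of the genotypes regressed on the covariates.

```r
set.seed(2)
n <- 300

# hypothetical covariates
x <- cbind(age = rnorm(n), sex = rbinom(n, 1, 0.5))

# hypothetical genotypes; the first two depend on a covariate
p <- plogis(-1 + 0.8 * x[, "age"])
g <- cbind(v1 = rbinom(n, 2, p),
           v2 = rbinom(n, 2, p),
           v3 = rbinom(n, 2, 0.3))
m <- ncol(g)

# step 1: invert the joint correlation matrix of genotypes and covariates
r_all <- cor(cbind(g, x))
p_all <- solve(r_all)  # stand-in for a generalized inverse (full rank assumed)

# step 2: invert back only the genotype block, then rescale to unit diagonal
r_adj <- cov2cor(solve(p_all[1:m, 1:m]))

# cross-check: correlation among genotype residuals after regressing on x
r_res <- cor(resid(lm(g ~ x)))

print(round(r_adj, 3))
print(all.equal(unname(r_adj), unname(r_res)))
```

Inverting the genotype block of the inverse yields the covariance of g conditional on x, which is why the result matches the residual correlation rather than cor(g).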
If Z-scores were calculated from genotypes with some missing values, the
correlation among test statistics is reduced by an amount that can be
derived theoretically. It can be shown that this reduced correlation can be
calculated by imputing the missing values with the averages of the
non-missing values. Therefore, by default, cst() fills missing values in
each variant with the average of the non-missing values in that same variant
(i.e., imputation by average, imp_avg()). Other imputation methods are also
available (see topic imp for other techniques that may improve power), but
note that techniques other than imputation by average require one to re-run
the association analyses with the imputed variants, to ensure that the
correlation among the new statistics (i.e., Z-scores) and the correlation
among the imputed variants are identical. Otherwise, Type I error may be
inflated for decorrelation-based methods.
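Imputation by average can be illustrated with a few lines of base R. The helper fill_avg() below is hypothetical and written only for this example; it is not the package's imp_avg(), whose exact interface is not shown here. The toy genotype matrix is likewise made up.

```r
# hypothetical helper: replace each variant's missing values with the
# mean of that variant's non-missing values
fill_avg <- function(g) {
  for (j in seq_len(ncol(g))) {
    miss <- is.na(g[, j])
    g[miss, j] <- mean(g[, j], na.rm = TRUE)
  }
  g
}

# toy genotype matrix with one missing value per variant
g <- cbind(v1 = c(0, 1, 2, NA, 1),
           v2 = c(1, NA, 2, 0, 1))

g_imp <- fill_avg(g)
print(g_imp)

# the filled matrix has no missing values, so cor() can be applied directly
print(cor(g_imp))
```

Filling with the variant mean leaves each column mean unchanged, so the imputed entries contribute nothing to the centered cross-products.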
Value
Correlation matrix among association test statistics.
See Also
Examples
## get genotype and covariate matrices
gno <- readRDS(system.file("extdata", 'rs208294_gno.rds', package="dotgen"))
cvr <- readRDS(system.file("extdata", 'rs208294_cvr.rds', package="dotgen"))
## correlation among association statistics, covariates involved
res <- cst(gno, cvr)
print(res[1:4, 1:4])
## genotype matrix with 2% randomly missing data
g02 <- readRDS(system.file("extdata", 'rs208294_g02.rds', package="dotgen"))
cvr <- readRDS(system.file("extdata", 'rs208294_cvr.rds', package="dotgen"))
res <- cst(g02, cvr)
print(res[1:4, 1:4])