imp {dotgen} | R Documentation |
Impute missing genotypes
Description
Impute missing genotype calls with values inferred from non-missing ones.
Usage
imp_avg(g, ...)
imp_cnd(g, ...)
Arguments
g | genotype matrix, one row per sample and one column per variant. |
... | additional parameters. |
Details
A seemingly naive way to impute a missing value is to use the average of all
non-missing values of the same variant, imp_avg(). Besides simplicity,
imputation by average has the advantage of approximating the correlation
among test statistics (i.e., Z-scores) when the original association
analyses were performed with missing values left unfilled, which is common
practice. This naive approach is the default for the correlation calculator
cst().
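The per-variant averaging described above can be sketched in base R. This is a minimal illustration, not the code behind imp_avg(); the helper name impute_by_mean and the toy matrix are assumptions made for the example.

```r
# Hypothetical helper (not the dotgen implementation): fill each missing
# call with the mean of the non-missing calls of the same variant (column).
impute_by_mean <- function(g) {
  for (j in seq_len(ncol(g))) {
    miss <- is.na(g[, j])
    # A column with no observed calls is left untouched, since no mean exists.
    if (any(miss) && !all(miss)) {
      g[miss, j] <- mean(g[!miss, j])
    }
  }
  g
}

# Toy genotype matrix: 4 samples (rows) by 3 variants (columns),
# filled column by column.
g <- matrix(c(0, 1, NA, 2,
              1, NA, 1, 0,
              2, 2, 0, NA),
            nrow = 4, dimnames = list(NULL, c("v1", "v2", "v3")))
g_avg <- impute_by_mean(g)
```

Each imputed entry depends only on its own column, so the observed calls are left unchanged and no information is shared across variants.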
A more advanced approach is the conditional expectation method, imp_cnd(),
which exploits the relationship between variants and borrows information
from variants other than the target one when filling in a missing call. The
sample correlation among variants imputed this way is closer to the true LD
and may improve power. However, after this imputation one must re-run the
association analyses with the imputed variants to avoid inflation of the
Type I error rate.
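To illustrate the idea of borrowing information from other variants, the sketch below imputes each missing call by a generic conditional expectation, E[x_m | x_o] = mu_m + S_mo S_oo^{-1} (x_o - mu_o), with means and covariances estimated from the observed calls. This is an assumption-laden illustration rather than the algorithm used by imp_cnd(): the helper name impute_by_cond_exp, the ridge argument, and the use of pairwise-complete covariances are all choices made for this example.

```r
# Hypothetical sketch (not the dotgen implementation): impute missing calls
# of a sample by their conditional expectation given that sample's observed
# calls, using means and covariances estimated from the data.
impute_by_cond_exp <- function(g, ridge = 1e-6) {
  mu <- colMeans(g, na.rm = TRUE)               # per-variant means
  S  <- cov(g, use = "pairwise.complete.obs")   # pairwise covariances
  S[is.na(S)] <- 0      # assumption: no shared samples means no covariance
  out <- g
  for (i in seq_len(nrow(g))) {
    m <- is.na(g[i, ])                          # missing variants in sample i
    if (!any(m)) next
    if (all(m)) { out[i, ] <- mu; next }        # nothing observed: use means
    o <- !m
    # A small ridge keeps the observed block invertible.
    S_oo <- S[o, o, drop = FALSE] + diag(ridge, sum(o))
    S_mo <- S[m, o, drop = FALSE]
    out[i, m] <- mu[m] + drop(S_mo %*% solve(S_oo, g[i, o] - mu[o]))
  }
  out
}

# Two strongly related variants; the last call of v2 is missing.
g2 <- cbind(v1 = c(0, 1, 2, 0, 1, 2),
            v2 = c(0, 1, 2, 0, 1, NA))
g2_cnd <- impute_by_cond_exp(g2)
```

In this toy case the missing call of v2 is pulled above the plain column mean of v2, because the same sample carries a high value at the correlated variant v1.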
Value
imputed genotype matrix without any missing values.
Functions
- imp_avg: imputation by average.
- imp_cnd: imputation by conditional expectation.