scan_multi_oneqtl {qtl2pleio} | R Documentation |
Perform multivariate, one-QTL model fitting for markers on all chromosomes
Description
The function first discards individuals with one or more missing phenotypes or missing covariates.
It then infers variance components, Vg and Ve. Both Vg and Ve
are d by d covariance matrices. It uses an expectation maximization algorithm, as
implemented in the 'gemma2' R package. 'gemma2' R package is an R implementation of the
GEMMA algorithm for multivariate variance component estimation (Zhou & Stephens 2014 Nature methods).
Note that variance components are fitted on a model that uses the d-variate phenotype
but contains no genetic information. This model does, however,
use the specified covariates (after dropping dependent columns
in the covariates matrix).
These inferred covariance matrices, \hat{Vg}
and \hat{Ve}
,
are then used in subsequent model fitting via
generalized least squares.
Generalized least squares model fitting is applied to every marker on
every chromosome.
For a single marker, we fit the model:
vec(Y) = Xvec(B) + vec(G) + vec(E)
where
G \sim MN(0, K, \hat{Vg})
and
E \sim MN(0, I, \hat{Ve})
where MN
denotes the matrix-variate
normal distribution with three parameters: mean matrix, covariance among rows, and
covariance among columns. vec
denotes the vectorization operation, ie, stacking by columns.
K
is a kinship matrix, typically calculated by leave-one-chromosome-out methods.
Y
is the n by d phenotypes matrix. X
is a block-diagonal nd by fd matrix consisting of
d blocks each of dimension n by f. Each n by f block (on the diagonal) contains a matrix of
founder allele probabilities for the n subjects at a single marker. The off-diagonal blocks
have only zero entries.
The log-likelihood is returned for each model. The outputted object is a tibble with
d + 1 columns. The first d columns specify the markers used in the corresponding model fit, while
the last column specifies the log-likelihood value at that d-tuple of markers.
Usage
scan_multi_oneqtl(
probs_list,
pheno,
kinship_list = NULL,
addcovar = NULL,
cores = 1
)
Arguments
probs_list |
an list of arrays of founder allele probabilities |
pheno |
a matrix of phenotypes |
kinship_list |
a list of kinship matrices, one for each chromosome |
addcovar |
a matrix, n subjects by c additive covariates |
cores |
number of cores for parallelization via parallel::mclapply() |
Value
a tibble with d + 1 columns. First d columns indicate the genetic data (by listing the marker ids) used in the design matrix; last is log10 likelihood
References
Knott SA, Haley CS (2000) Multitrait least squares for quantitative trait loci detection. Genetics 156: 899–911.
Jiang C, Zeng ZB (1995) Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140: 1111-1127.
Zhou X, Stephens M (2014) Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nature methods 11:407-409.
Broman KW, Gatti DM, Simecek P, Furlotte NA, Prins P, Sen S, Yandell BS, Churchill GA (2019) R/qtl2: software for mapping quantitative trait loci with high-dimensional data and multi-parent populations. GENETICS https://www.genetics.org/content/211/2/495.
Examples
# read data
n <- 50
pheno <- matrix(rnorm(2 * n), ncol = 2)
rownames(pheno) <- paste0("s", 1:n)
colnames(pheno) <- paste0("tr", 1:2)
probs <- array(dim = c(n, 2, 5))
probs[ , 1, ] <- rbinom(n * 5, size = 1, prob = 0.2)
probs[ , 2, ] <- 1 - probs[ , 1, ]
rownames(probs) <- paste0("s", 1:n)
colnames(probs) <- LETTERS[1:2]
dimnames(probs)[[3]] <- paste0("m", 1:5)
scan_multi_oneqtl(probs_list = list(probs, probs), pheno = pheno, cores = 1)