R: Mixed-model solver

mixed.solve {rrBLUP}

R Documentation

Mixed-model solver

Description

Calculates maximum-likelihood (ML/REML) solutions for mixed models of the form

y = X \beta + Z u + \varepsilon

where \beta is a vector of fixed effects and u is a vector of random effects with Var[u] = K \sigma^2_u. The residual variance is Var[\varepsilon] = I \sigma^2_e. This class of mixed models, in which there is a single variance component other than the residual error, has a close relationship with ridge regression (ridge parameter \lambda = \sigma_e^2 / \sigma^2_u).

Usage

mixed.solve(y, Z=NULL, K=NULL, X=NULL, method="REML", 
        bounds=c(1e-09, 1e+09), SE=FALSE, return.Hinv=FALSE)

Arguments

`y`	Vector (`n \times 1`) of observations. Missing values (NA) are omitted, along with the corresponding rows of X and Z.
`Z`	Design matrix (`n \times m`) for the random effects. If not passed, assumed to be the identity matrix.
`K`	Covariance matrix (`m \times m`) for random effects; must be positive semi-definite. If not passed, assumed to be the identity matrix.
`X`	Design matrix (`n \times p`) for the fixed effects. If not passed, a vector of 1's is used to model the intercept. X must be full column rank (implies `\beta` is estimable).
`method`	Specifies whether the full ("ML") or restricted ("REML") maximum-likelihood method is used.
`bounds`	Array with two elements specifying the lower and upper bound for the ridge parameter.
`SE`	If TRUE, standard errors are calculated.
`return.Hinv`	If TRUE, the function returns the inverse of `H = Z K Z' + \lambda I`. This is useful for `GWAS`.

Details

This function can be used to predict marker effects or breeding values (see examples). The numerical method is based on the spectral decomposition of Z K Z' and S Z K Z' S, where S = I - X (X' X)^{-1} X' is the projection operator for the nullspace of X (Kang et al., 2008). This algorithm generates the inverse phenotypic covariance matrix V^{-1}, which can then be used to calculate the BLUE and BLUP solutions for the fixed and random effects, respectively, using standard formulas (Searle et al. 1992):

BLUE(\beta) = \beta^* = (X'V^{-1}X)^{-1}X'V^{-1}y

BLUP(u) = u^* = \sigma^2_u KZ'V^{-1}(y-X\beta^*)

The standard errors are calculated as the square root of the diagonal elements of the following matrices (Searle et al. 1992):

Var[\beta^*] = (X'V^{-1}X)^{-1}

Var[u^*-u] = K \sigma^2_u - \sigma^4_u KZ'V^{-1}ZK + \sigma^4_u KZ'V^{-1}XVar[\beta^*]X'V^{-1}ZK

For marker effects where K = I, the function will run faster if K is not passed than if the user passes the identity matrix.

Value

If SE=FALSE, the function returns a list containing

$Vu: estimator for \sigma^2_u
$Ve: estimator for \sigma^2_e
$beta: BLUE(\beta)
$u: BLUP(u)
$LL: maximized log-likelihood (full or restricted, depending on method)

If SE=TRUE, the list also contains

$beta.SE: standard error for \beta
$u.SE: standard error for u^*-u

If return.Hinv=TRUE, the list also contains

$Hinv: the inverse of H

References

Kang et al. 2008. Efficient control of population structure in model organism association mapping. Genetics 178:1709-1723.

Endelman, J.B. 2011. Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome 4:250-255.

Searle, S.R., G. Casella and C.E. McCulloch. 1992. Variance Components. John Wiley, Hoboken.

Examples

#random population of 200 lines with 1000 markers
M <- matrix(rep(0,200*1000),200,1000)
for (i in 1:200) {
  M[i,] <- ifelse(runif(1000)<0.5,-1,1)
}

#random phenotypes
u <- rnorm(1000)
g <- as.vector(crossprod(t(M),u))
h2 <- 0.5  #heritability
y <- g + rnorm(200,mean=0,sd=sqrt((1-h2)/h2*var(g)))

#predict marker effects
ans <- mixed.solve(y,Z=M)  #By default K = I
accuracy <- cor(u,ans$u)

#predict breeding values
ans <- mixed.solve(y,K=A.mat(M))
accuracy <- cor(g,ans$u)

[Package rrBLUP version 4.6.3 Index]