marker_h2_means {heritability} | R Documentation |
Compute a marker-based estimate of heritability, given genotypic means.
Description
Given a genetic relatedness matrix and genotypic means,
this function computes REML-estimates of the genetic and
residual variance and their standard errors, using the AI-algorithm (Gilmour et al. 1995).
Based on this, heritability estimates and confidence intervals are given
(the estimator h_m^2
in Kruijer et al.).
Usage
marker_h2_means(data.vector, geno.vector, K, Dm=NULL, alpha = 0.05, eps = 1e-06,
max.iter = 100, fix.h2 = FALSE, h2 = 0.5, grid.size=99)
Arguments
data.vector |
A vector of phenotypic observations, typically genotypic means. Needs to be of type numeric. May contain missing values. |
geno.vector |
A vector of genotype labels, either a factor or character. This vector should
correspond to |
K |
A genetic relatedness or kinship matrix, typically marker-based.
Must have row- and column-names corresponding to the levels of |
Dm |
Covariance of the genotypic means contained in data.vector; see details. Should be of class matrix, with row- and column-names corresponding to the levels of |
alpha |
Confidence level, for the 1-alpha confidence intervals. |
eps |
Numerical precision, used as convergence criterion in the AI-algorithm. |
max.iter |
Maximal number of iterations in the AI-algorithm. |
fix.h2 |
Compute the log-likelihood and inverse AI-matrix for a fixed heritability value. Default is |
h2 |
When |
grid.size |
If the AI-algorithm has not converged after |
Details
Given phenotypic observations
Y_{i}
for genotypesi=1,...,n
, the mixed modelY_{i} = \mu + G_i + E_{i}
is assumed. Typically, theY_{i}
are genotypic means or BLUEs obtained from fitting a linear (mixed) model to the raw data, containing several plants or plots for each genotype. The vector of additive genetic effects(G_1,...,G_n)'
follows a multivariate normal distribution with mean zero and covariance\sigma_A^2 K
, where\sigma_A^2
is the additive genetic variance, andK
is a genetic relatedness matrix derived from a dense set of markers. The vector of errors(E_1,...,E_n)'
follows a multivariate normal distribution with mean zero and covariance\sigma_E^2 D_m
, whereD_m
is the covariance of the means obtained from the initial analysis. In case of a completely randomized design withr_i
replicates for genotypesi=1,...,n
,D_m
is diagonal with elements1 / r_i
. Under certain assumptions (see Speed et al. 2012) the marker- or chip-heritabilityh^2 = \sigma_A^2 / (\sigma_A^2 + \sigma_E^2)
equals the narrow-sense heritability.As in the
marker_h2
function, it is assumed that the genetic relatedness matrixK
is scaled such thattrace(P K P) = n - 1
, whereP
is the projection matrixI_n - 1_n 1_n' / n
, for the identity matrixI_n
and1_n
being a column vector of ones. If this is not the case,K
is automatically scaled prior to fitting the mixed model.No covariates can be included, as it is assumed that these are available at plant- or plot level, and accounted for in the genotypic means.
The resulting heritability estatimes are less accurate than those obtained from individual plant or plot data, and the likelihood can be monotone in
h^2 = \sigma_A^2 / (\sigma_A^2 + \sigma_E^2)
. If the AI-algorithm has not converged aftermax.iter
iterations, the likelihood is computed on the grid of heritability values 1/(grid.size
+1),...,grid.size
/(grid.size
+1)As in the
marker_h2
function, confidence intervals for heritability are constructed using the delta-method and the inverse AI-matrix. The delta-method can be applied either directly to the function(\sigma_A^2,\sigma_E^2) -> \sigma_A^2 / (\sigma_A^2 + \sigma_E^2)
or to the function(\sigma_A^2,\sigma_E^2) -> log(\sigma_A^2 / \sigma_E^2)
. In the latter case, a confidence interval forlog(\sigma_A^2 / \sigma_E^2)
is obtained, which is back-transformed to a confidence interval for heritability. This approach (proposed in Kruijer et al.) has the advantage that intervals are always contained in the unit interval.
Value
A list with the following components:
va: REML-estimate of the (additive) genetic variance.
ve: REML-estimate of the residual variance.
h2: Plug-in estimate of heritability:
va / (va + ve)
.conf.int1: 1-alpha confidence interval for heritability.
conf.int2: 1-alpha confidence interval for heritability, obtained by application of the delta method on a logarithmic scale.
inv.ai: The inverse of the average information (AI) matrix.
loglik: The log-likelihood.
loglik.vector: Empty numeric vector if the AI-algorthm converged within
max.iter
iterations. Otherwise it contains the log-likelihood on a grid.
Author(s)
Willem Kruijer.
References
Gilmour et al. Gilmour, A.R., R. Thompson and B.R. Cullis (1995) Average Information REML: An Efficient Algorithm for Variance Parameter Estimation in Linear Mixed Models. Biometrics, volume 51, number 4, 1440-1450.
Kruijer, W. et al. (2015) Marker-based estimation of heritability in immortal populations. Genetics, Vol. 199(2), p. 1-20.
Speed, D., G. Hemani, M. R. Johnson, and D.J. Balding (2012) Improved heritability estimation from genome-wide snps. the American journal of human genetics 91: 1011-1021.
See Also
For marker-based estimation of heritability using individual plant or plot data, see
marker_h2
.
Examples
data(means_LDV)
data(R_matrix_LDV)
data(K_atwell)
out <- marker_h2_means(data.vector=means_LDV$LDV,geno.vector=means_LDV$genotype,
K=K_atwell,Dm=R_matrix_LDV)
# Takes about a minute:
#data(means_LD)
#data(R_matrix_LD)
#out <- marker_h2_means(data.vector=means_LD$LD,geno.vector=means_LD$genotype,
# K=K_atwell,Dm=R_matrix_LD)
# The likelihood is monotone increasing:
#plot(x=(1:99)/100,y=out$loglik.vector,type="l",ylab="log-likelihood",lwd=2,
# main='',xlab='h2',cex.lab=2,cex.axis=2.5)