G.matrix {snpReady} | R Documentation |
Estimation of Genomic Relationship Matrix
Description
Generates four different types of genomic relationship matrix (GRM).
Usage
G.matrix(M, method=c("VanRaden", "UAR", "UARadj", "GK"), format=c("wide", "long"),
plot = FALSE)
Arguments
M |
|
method |
Method to build the GRM. Four methods are currently supported. |
format |
Type of object to be returned. |
plot |
If |
Details
G.matrix provides four different types of relationship matrix. The VanRaden method estimates the relationship matrix as proposed by VanRaden (2008):
G = \frac{XX'}{trace(XX')/n}
where X is the centered marker matrix and n is the number of individuals. For any marker locus i, x_i = m_i - 2p_{i}, where m_i is the vector of SNP genotypes coded as allele counts (0, 1 and 2) and p_i is the allele frequency at locus i.
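The VanRaden formula above can be sketched in a few lines of base R. This is an illustration only, not the package's internal code: the toy matrix M is hypothetical data, and estimating p_i from the sample as half the column mean is an assumption.

```r
# Toy marker matrix (hypothetical data): 4 individuals in rows, 5 SNPs in
# columns, coded as allele counts 0, 1 and 2.
M <- matrix(c(0, 1, 2, 1, 0,
              1, 1, 0, 2, 0,
              2, 0, 1, 1, 1,
              1, 2, 1, 0, 0),
            nrow = 4, byrow = TRUE)

n <- nrow(M)                  # number of individuals
p <- colMeans(M) / 2          # allele frequency per SNP (assumed: sample estimate)
X <- sweep(M, 2, 2 * p)       # x_i = m_i - 2 p_i (sweep subtracts by default)
XXt <- tcrossprod(X)          # XX'
G <- XXt / (sum(diag(XXt)) / n)
```

Dividing by trace(XX')/n scales the matrix so that its diagonal averages to 1.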
UAR is the genomic relationship matrix proposed by Yang et al. (2010), named the Unified Additive Relationship (UAR). It is obtained by
G_{UAR} = A_{jk} = \frac{1}{N} \sum_i{A_{ijk}} = \left \{
\begin{array}{ll}
\frac{1}{N} \sum_i{\frac{(x_{ij} - 2p_{i})(x_{ik} - 2p_i)}{2p_i(1-p_i)}}, & j \neq k \\
1 + \frac{1}{N} \sum_i{\frac{x_{ij}^{2} - (1 + 2p_{i})x_{ij} + 2p_i^{2}}{2p_i(1-p_i)}}, & j = k
\end{array}
\right.
where N is the number of SNPs, p_i is the allele frequency at SNP i, and x_{ij} is the SNP genotype, taking a value of 0, 1 or 2, of the j^{th} individual at SNP i.
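As a rough sketch, the two cases of the UAR definition can be computed in base R as follows. The diagonal uses the numerator x_{ij}^2 - (1 + 2p_i)x_{ij} + 2p_i^2 from Yang et al. (2010). The toy matrix M is hypothetical, p_i is assumed to be estimated from the sample, and every SNP is assumed to be polymorphic so that 2p_i(1-p_i) is nonzero.

```r
# Toy marker matrix (hypothetical data): individuals in rows, SNPs in columns.
M <- matrix(c(0, 1, 2, 1, 0,
              1, 1, 0, 2, 0,
              2, 0, 1, 1, 1,
              1, 2, 1, 0, 0),
            nrow = 4, byrow = TRUE)

N <- ncol(M)                       # number of SNPs
p <- colMeans(M) / 2               # allele frequencies (assumed: sample estimate)
het <- 2 * p * (1 - p)             # 2 p_i (1 - p_i)

# Off-diagonal case (j != k): average over SNPs of standardized cross-products.
Z <- sweep(sweep(M, 2, 2 * p), 2, sqrt(het), "/")
A <- tcrossprod(Z) / N

# Diagonal case (j = k): overwrite the diagonal with its own formula.
P <- matrix(p, nrow = nrow(M), ncol = N, byrow = TRUE)
num <- M^2 - (1 + 2 * P) * M + 2 * P^2
diag(A) <- 1 + rowMeans(sweep(num, 2, het, "/"))
```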
The same authors proposed an adjustment to the original UAR matrix (UARadj) to reduce the bias in estimating the variance of the relationships at causal loci. Thus:
G_{UARadj} = \left \{
\begin{array}{ll}
\beta A_{jk}, & j \neq k \\
1 + \beta(A_{jk} - 1), & j = k
\end{array}
\right.
where \beta = 1 - \frac{c + 1/N}{var(A_{jk})} and c is a constant that depends on the MAF of the causal variants, assumed here to be 0.
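A minimal sketch of the adjustment, applied to a small hypothetical UAR matrix A. The values of A and N are made up for illustration, and taking var(A_{jk}) over the off-diagonal elements (lower triangle) is an assumption, not something this page states.

```r
# Hypothetical 3 x 3 UAR matrix and SNP count, for illustration only.
A <- matrix(c( 1.10, 0.20, -0.10,
               0.20, 0.90,  0.05,
              -0.10, 0.05,  1.00),
            nrow = 3, byrow = TRUE)
N <- 1000       # number of SNPs (hypothetical)
c_maf <- 0      # the constant c, assumed to be 0 as in the text

# Assumption: var(A_jk) is the variance of the off-diagonal relationships.
beta <- 1 - (c_maf + 1 / N) / var(A[lower.tri(A)])

G_adj <- beta * A                          # j != k
diag(G_adj) <- 1 + beta * (diag(A) - 1)    # j == k
```

With a large number of SNPs, beta is close to 1 and the adjustment shrinks the relationships only slightly.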
GK represents the Gaussian kernel, obtained by
K (x_i, x_{i'}) = exp\left(\frac{-d_{ii'}^2}{quantile(d^2, 0.5)}\right)
where d_{ii'}^2 is the squared Euclidean distance between individuals i and i'.
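The Gaussian kernel can be sketched in base R as below. The squared distance is divided by the median inside the exponential, which is the usual Gaussian kernel form. Computing the median over the full distance matrix (including the zero diagonal) and using uncentered markers are assumptions; the toy matrix M is hypothetical.

```r
# Toy marker matrix (hypothetical data): individuals in rows, SNPs in columns.
M <- matrix(c(0, 1, 2, 1, 0,
              1, 1, 0, 2, 0,
              2, 0, 1, 1, 1,
              1, 2, 1, 0, 0),
            nrow = 4, byrow = TRUE)

D2 <- as.matrix(dist(M))^2          # squared Euclidean distances
h <- unname(quantile(D2, 0.5))      # bandwidth: median squared distance
K <- exp(-D2 / h)                   # Gaussian kernel
```

Each individual has kernel value 1 with itself, and values decay toward 0 as the distance between two individuals grows.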
The format argument sets the desired output format. For "wide", the relationship matrix is returned in matrix format, with n \times n dimension.
If "long" is the chosen format, the inverse of the relationship matrix is computed and converted to a table. In this case, the lower triangular part of the matrix is turned into a table with three columns holding the respective rows, columns, and values (a format used mainly by ASReml).
If the relationship matrix is not positive definite, a near positive definite matrix is created and inverted, and a warning message is issued.
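The "long" conversion can be sketched as follows, assuming the matrix is already positive definite. The matrix G and the column names Row, Column and Value are hypothetical; the names the function actually returns may differ.

```r
# Hypothetical positive definite relationship matrix, for illustration only.
G <- matrix(c(2.0, 0.5, 0.1,
              0.5, 1.5, 0.3,
              0.1, 0.3, 1.2),
            nrow = 3, byrow = TRUE)

Ginv <- solve(G)                                    # inverse of the matrix

# Indices of the lower triangle, diagonal included.
idx <- which(lower.tri(Ginv, diag = TRUE), arr.ind = TRUE)

# Three-column table: row index, column index, value.
long <- data.frame(Row = idx[, "row"],
                   Column = idx[, "col"],
                   Value = Ginv[idx])
long <- long[order(long$Row, long$Column), ]        # sort by row, then column
```

An n \times n matrix yields n(n + 1)/2 rows in this table, since only the lower triangle and the diagonal are kept.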
Value
It returns the GRM. If the method is VanRaden, additive and dominance matrices are produced. Otherwise, only the additive matrix is returned.
If plot is TRUE, a heat map of the pairwise relationships is saved as a PDF in the working directory. A 3D plot of the first three principal components is also generated.
References
Pérez-Elizalde, S., Cuevas, J., Pérez-Rodríguez, P., Crossa, J. (2015) Selection of the Bandwidth Parameter in a Bayesian Kernel Regression Model for Genomic-Enabled Prediction. Journal of Agricultural, Biological, and Environmental Statistics, 20(4):512-532
Yang, J., Benyamin, B., McEvoy, B.P., et al. (2010) Common SNPs Explain a Large Proportion of the Heritability for Human Height. Nature Genetics, 42:565-569
VanRaden, P.M. (2008) Efficient Methods to Compute Genomic Predictions. Journal of Dairy Science, 91:4414-4423
Examples
#(1) Additive and dominance relationship matrix
data(maize.hyb)
x <- G.matrix(maize.hyb, method = "VanRaden", format = "wide")
A <- x$Ga
D <- x$Gd