getK {BGGE}    R Documentation

Kernel matrix for GE genomic selection models

Description

Create kernel matrix for GE genomic prediction models

Usage

getK(Y, X, kernel = c("GK", "GB"), setKernel = NULL, bandwidth = 1,
             model = c("SM", "MM", "MDs", "MDe"), quantil = 0.5,
             intercept.random = FALSE)

Arguments

Y

data.frame Phenotypic data with three columns. The first column is a factor for environments, the second column is a factor identifying genotypes, and the third column contains the trait of interest.

X

Marker matrix with individuals in rows and markers in columns. Missing markers are not allowed.

kernel

Kernel to be created internally. The methods currently implemented are the Gaussian kernel (GK) and the linear GBLUP kernel (GB).

setKernel

matrix Single user-supplied kernel matrix, for use when a kernel other than GK or GB is needed.

bandwidth

vector Bandwidth parameter used to create the Gaussian kernel (GK) matrix. The default bandwidth is 1. This parameter can be estimated with a Bayesian approach, as presented in Perez-Elizalde et al. (2015).

model

Specifies the genotype × environment model to be fitted. The currently supported models are SM, MM, MDs and MDe. See Details.

quantil

Specifies the quantile of the genetic distances used to scale the Gaussian kernel. The default is 0.5 (the median).

intercept.random

If TRUE, the kernel related to the random intercept of genotype is included.

Details

The aim is to create kernels for fitting GE interaction models applied to genomic prediction. Two standard genomic kernels are currently supported. GB creates a linear kernel from the cross-product of centered and standardized marker genotypes, divided by the number of markers p:

GB = \frac{XX^T}{p}
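
As a rough illustration of the GB formula, the following base R sketch builds a linear kernel from a small simulated marker matrix. The data, the dimensions and the step that drops monomorphic markers are assumptions made for this example only; this is not the package's internal code.

```r
# Simulated markers coded 0/1/2 (toy data, not from the package)
set.seed(1)
n <- 20    # individuals
p0 <- 50   # markers before filtering
M <- matrix(rbinom(n * p0, size = 2, prob = 0.5), nrow = n)

# Drop monomorphic markers so that standardization does not divide by zero
M <- M[, apply(M, 2, var) > 0, drop = FALSE]

# Center and standardize each marker column
X <- scale(M, center = TRUE, scale = TRUE)
p <- ncol(X)

# Linear kernel: cross-product of the standardized markers divided by p
GB <- tcrossprod(X) / p

dim(GB)          # n x n
mean(diag(GB))   # equals (n - 1) / n because scale() uses the sample variance
```

The result is a symmetric n x n matrix that could also be passed to setKernel as a user-supplied kernel.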

The alternative is the Gaussian kernel GK, given by:

GK(x_i, x_{i'}) = \exp\left(\frac{-h d_{ii'}^2}{q(d)}\right)

where d_{ii'}^2 is the squared genetic distance between individuals i and i' based on markers, q(d) is the quantile of the distances used for scaling (argument quantil), and h is the bandwidth parameter (argument bandwidth). Other kernels can be provided through setKernel. In this case, the arguments X, kernel and bandwidth are ignored.
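
The Gaussian kernel formula can be sketched in base R as follows. The marker data are simulated, the distance is assumed to be Euclidean, and the quantile is assumed to be taken over all entries of the squared distance matrix; the package's own computation may differ in these details.

```r
# Simulated, standardized markers (toy data): 10 individuals x 30 markers
set.seed(2)
X <- scale(matrix(rnorm(10 * 30), nrow = 10))

# Squared Euclidean distances between individuals (assumed distance measure)
D2 <- as.matrix(dist(X))^2

h <- 1          # bandwidth parameter (argument bandwidth)
quantil <- 0.5  # quantile used for scaling (argument quantil)

# Scaling constant q(d); taken here over all entries of D2 (an assumption)
q_d <- as.numeric(quantile(D2, probs = quantil))

# Gaussian kernel: exp(-h * d^2 / q(d))
GK <- exp(-h * D2 / q_d)

range(GK)  # values lie in (0, 1], with 1 on the diagonal
```

Larger values of h make the kernel decay faster with distance, so fewer pairs of individuals are treated as similar.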

Currently, the supported models for GE kernels are SM, MM, MDs and MDe.

These GE genomic models were compared and named by Sousa et al. (2017). They can be extended with the kernel related to the random intercept of genotype through intercept.random.

Value

This function returns a two-level list, which specifies the kernel and the type of matrix. The type is a classification according to the matrix structure, i.e., whether the matrix is dense or block diagonal. For the main effect (G), the matrix is classified as dense (D). Matrices for environment-specific and genotype × environment (GE) effects are classified as block diagonal (BD). This classification is used as part of the prediction through the BGGE function.
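
The difference between a dense (D) and a block diagonal (BD) matrix can be illustrated with a toy kernel in base R. The sketch assumes balanced data sorted by environment and uses Kronecker products for convenience; the matrix K and the object names are hypothetical, and this is not the package's internal construction.

```r
# Toy genomic kernel for 3 genotypes (hypothetical values)
K <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.3,
              0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)
n_env <- 2
n_gen <- nrow(K)

# Main genotypic effect across environments: K is repeated in every block,
# so the matrix has no structural zeros (dense, type "D")
K_G <- kronecker(matrix(1, n_env, n_env), K)

# Genotype x environment effect: K appears only within each environment,
# and blocks between different environments are zero (block diagonal, type "BD")
K_GE <- kronecker(diag(n_env), K)

K_G[1:3, 4:6]   # equals K
K_GE[1:3, 4:6]  # all zeros
```

Recording the block diagonal type lets downstream code work block by block instead of handling one large dense matrix.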

References

Jarquin, D., J. Crossa, X. Lacaze, P. Du Cheyron, J. Daucourt, J. Lorgeou, F. Piraux, L. Guerreiro, P. Pérez, M. Calus, J. Burgueño, and G. de los Campos. 2014. A reaction norm model for genomic selection using high-dimensional genomic and environmental data. Theor. Appl. Genet. 127(3): 595-607.

Lopez-Cruz, M., J. Crossa, D. Bonnett, S. Dreisigacker, J. Poland, J.-L. Jannink, R.P. Singh, E. Autrique, and G. de los Campos. 2015. Increased prediction accuracy in wheat breeding trials using a marker × environment interaction genomic selection model. G3: Genes, Genomes, Genetics. 5(4): 569-82.

Perez-Elizalde, S., J. Cuevas, P. Perez-Rodriguez, and J. Crossa. 2015. Selection of the bandwidth parameter in a Bayesian kernel regression model for genomic-enabled prediction. Journal of Agricultural, Biological, and Environmental Statistics (JABES). 20(4): 512-532.

Sousa, M.B., J. Cuevas, E.G.C. Oliveira, P. Perez-Rodriguez, D. Jarquin, R. Fritsche-Neto, J. Burgueno, and J. Crossa. 2017. Genomic-enabled prediction in maize using kernel models with genotype × environment interaction. G3: Genes, Genomes, Genetics. 7(6): 1995-2014.

Examples

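# Create kernel matrices for model MDs using the wheat dataset from BGLR
library(BGGE)
library(BGLR)

data(wheat)
# Center and standardize the markers; row names match the genotype IDs in GID
X <- scale(wheat.X, scale = TRUE, center = TRUE)
rownames(X) <- 1:599
# Long-format phenotypes: 4 environments x 599 genotypes
pheno_geno <- data.frame(env = gl(n = 4, k = 599),
                         GID = gl(n = 599, k = 1, length = 599 * 4),
                         value = as.vector(wheat.Y))

K <- getK(Y = pheno_geno, X = X, kernel = "GB", model = "MDs")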




[Package BGGE version 0.6.5 Index]