R: Genotype x Environment models using regression kernel

BGGE {BGGE}

R Documentation

Genotype x Environment models using regression kernel

Description

The BGGE function fits Bayesian regression models for continuous observations using regression kernels.

Usage

BGGE(y, K, XF = NULL, ne, ite = 1000, burn = 200, thin = 3, verbose = FALSE, 
            tol = 1e-10, R2 = 0.5)

Arguments

`y`	Vector of data. Should be numeric and NAs are allowed.
`K`	list A two-level list specifying the regression kernels (covariance matrices). Each element is itself a list with two components: `Kernel`, the regression kernel matrix, and `Type`, which states whether the matrix is `D` (dense) or `BD` (block diagonal). One random effect is fitted for each kernel in the list.
`XF`	matrix Design matrix (n x p) for fixed effects.
`ne`	vector Number of genotypes in each environment.
`ite`	numeric Number of iterations.
`burn`	numeric Number of iterations to be discarded as burn-in.
`thin`	numeric Thinning interval.
`verbose`	Should the iteration history be printed to the console? If TRUE or 1, it is printed at every iteration; if another number n is chosen, the history is printed every n iterations. The default is FALSE.
`tol`	a numeric tolerance level. Eigenvalues lower than `tol` are discarded. Default is 1e-10.
`R2`	the proportion of variance expected to be explained by the regression.
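
To make the role of `tol` concrete, the following base R sketch discards the eigenvalues of a rank-deficient kernel that fall below the tolerance. It only illustrates the idea and is not taken from BGGE's internals; the simulated marker matrix and the names `keep`, `U` and `s` are assumptions made for the example.

```r
# Illustrative only: not BGGE's internal code.
set.seed(1)
X <- matrix(rnorm(5 * 20), nrow = 5)   # 5 hypothetical genotypes, 20 markers
X <- rbind(X, X[1, ])                  # duplicate one genotype: kernel becomes rank deficient
G <- tcrossprod(X) / ncol(X)           # 6 x 6 linear kernel of rank 5

tol <- 1e-10
eig <- eigen(G, symmetric = TRUE)
keep <- eig$values > tol               # eigenvalues lower than tol are discarded

U <- eig$vectors[, keep, drop = FALSE] # retained eigenvectors
s <- eig$values[keep]                  # retained eigenvalues

sum(keep)                              # number of retained eigenvalues
max(abs(G - U %*% (s * t(U))))         # error when rebuilding G from the retained part
```

Dropping the near-zero eigenvalue reduces the dimension of the problem while leaving the kernel essentially unchanged.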

Details

The goal is to fit genomic prediction models for continuous outcomes through a Gibbs sampler. BGGE reduces the dimension of the problem through an orthogonal transformation of the observed data (y) and applies differential shrinkage through the prior variance assigned to the regression parameters. Further details on this approach can be found in Cuevas et al. (2014). The primary genetic model is

y = g + e

where y is the response, g is the unknown random effect and e is the residual effect. You can specify as many random effects g as desired through a list of regression kernels, one per random effect, in the argument K. K is a two-level list: within each second-level list, the first element is the Kernel and the second element defines the type of matrix, either D (dense) or BD (block diagonal). For example, the regression kernels should have a structure like K = list(list(Kernel = G, Type = "D"), list(Kernel = G, Type = "BD")).

Because a spectral decomposition is computed for each kernel, BGGE takes advantage of the structure of block diagonal matrices and decomposes the sub-matrices instead of one large matrix. When a matrix is declared block diagonal, the number of subjects in each sub-matrix must be supplied in ne, which allows the sub-matrices to be extracted. If all kernels are of the dense type, ne is ignored.

Some genotype by environment models have kernels of the block diagonal type or similar. The genotype x environment deviation matrix in the MDs model (Sousa et al., 2017) is block diagonal. Likewise, the matrices for environment-specific variance in the MDe model (Sousa et al., 2017), if summed, form a block diagonal structure from which a sub-matrix can be extracted for each environment.
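
The block diagonal idea described above can be sketched in base R. The example builds a dense main-effect kernel and a block diagonal interaction kernel for two hypothetical environments with the same four genotypes, arranges them in the two-level list shape expected by K, and checks that decomposing each block separately gives the same eigenvalues as decomposing the full matrix. The simulated markers, the kernel names `G` and `GE`, and the helper variables are assumptions for illustration; no BGGE function is called.

```r
# Illustrative only: base R, no BGGE call is made here.
set.seed(42)
n_geno <- 4                                   # hypothetical genotypes per environment
X <- matrix(rnorm(n_geno * 30), nrow = n_geno)
G <- tcrossprod(X) / ncol(X)                  # genomic kernel for one environment

ne <- c(n_geno, n_geno)                       # two environments, 4 genotypes each
GE <- kronecker(diag(length(ne)), G)          # block diagonal 8 x 8 kernel

# Two-level list: one inner list (Kernel, Type) per random effect
K <- list(G  = list(Kernel = kronecker(matrix(1, 2, 2), G), Type = "D"),
          GE = list(Kernel = GE, Type = "BD"))

# Decompose each block separately, using ne to locate the sub-matrices
ends <- cumsum(ne)
starts <- ends - ne + 1
block_values <- unlist(lapply(seq_along(ne), function(i) {
  idx <- starts[i]:ends[i]
  eigen(GE[idx, idx], symmetric = TRUE)$values
}))

# Compare with the spectrum of the full matrix decomposed in one go
full_values <- eigen(GE, symmetric = TRUE)$values
all.equal(sort(block_values), sort(full_values))
```

Working block by block replaces one decomposition of a matrix of size sum(ne) with several decompositions of size ne[i], which is where the saving comes from when there are many environments.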

Value

A list containing the estimated posterior means of the residual variance and of the genetic variance component for each term in the linear model, together with the predicted genetic values. The values sampled along the chains are also returned.

References

Cuevas, J., Perez-Elizalde, S., Soberanis, V., Perez-Rodriguez, P., Gianola, D., & Crossa, J. (2014). Bayesian genomic-enabled prediction as an inverse problem. G3: Genes, Genomes, Genetics, 4(10), 1991-2001.

Sousa, M. B., Cuevas, J., Oliveira, E. G. C., Perez-Rodriguez, P., Jarquin, D., Fritsche-Neto, R., Burgueno, J. & Crossa, J. (2017). Genomic-enabled prediction in maize using kernel models with genotype x environment interaction. G3: Genes, Genomes, Genetics, 7(6), 1995-2014.

Examples

# single-environment model with one dense genomic kernel
library(BGGE)
library(BGLR)                      # provides the wheat data set
data(wheat)
X <- wheat.X[1:200, 1:600]         # Subset of 200 subjects and 600 markers
rownames(X) <- 1:200
Y <- wheat.Y[1:200, ]
A <- wheat.A[1:200, 1:200]         # Pedigree (not used below)

GB <- tcrossprod(X) / ncol(X)      # linear genomic kernel
K <- list(G = list(Kernel = GB, Type = "D"))
y <- Y[, 1]                        # phenotypes from the first environment
fit <- BGGE(y = y, K = K, ne = length(y), ite = 300, burn = 100, thin = 2)

# multi-environment main genotypic model
Env <- as.factor(c(2, 3))          # subset of 2 environments
pheno_geno <- data.frame(env = gl(n = 2, k = nrow(Y), labels = Env),
                         GID = gl(n = nrow(Y), k = 1, length = nrow(Y) * length(Env)),
                         value = as.vector(Y[, 2:3]))

K <- getK(Y = pheno_geno, X = X, kernel = "GB", model = "MM")
y <- pheno_geno[, 3]
fit <- BGGE(y = y, K = K, ne = rep(nrow(Y), length(Env)), ite = 300, burn = 100, thin = 1)

[Package BGGE version 0.6.5 Index]