MLM Trials {NAM} | R Documentation |
Genomic mixed model
Description
This function was developed to solve a mixel model for multi-environmental trials and/or replicated trials when genomic is available. The model includes a semi-parametric term to account for spatial variation when the field layout of the experiment is know. Covariates (fixed effects), genetics (random effect) and spatial term are fitted all in a single step.
Usage
gmm(y,gen,dta=NULL,it=75,bi=25,th=1,model="BRR",...)
Arguments
y |
Numeric vector of phenotypes of length |
gen |
Numeric matrix, with dimension |
dta |
Data frame with |
it |
Integer. Total numeric of MCMC iterations used to fit the model. |
bi |
Integer. Burn-in of MCMC iterations, i.e., number of iteration to be discarted prior to model convergence. |
th |
Integer. Thinning parameter: saves only 1 interation every |
model |
Prediction model: The options are: |
... |
Pass arguments to the function that builds the spatial splines |
Details
The general model is y=Xb+Zu+f(x)+e
, where y
is the response variable, Xb
refers to the fixed effects, Zu
regards the genetic effect, f(x)
represents the field variation, and e
is the vector of residuals. In this model u
is a link function that represents the genetic component, which depends on the model specified.
For whole-genome regression models (BRR or BayesA), u = Ma
, where M
is the matrix of genotypes. For kernel models (RKHS and GBLUP), u=N(0,K\sigma2a)
, where K is either a Gaussian kernel (RKHS) or a linear kernel (GBLUP). To avoid over-representation of genotypes, u
is not weighted according to the number of observations of each genotype.
Unobserved genotypes not provided in dta
but provided in gen
are predicted in the output of the function. Genotypes without genotypic information are transfered to the fixed effect (eg. checks). Missing loci are imputed with the expectation. If dta
is not provided, the function will work as a regular genomic prediction model, so the length of y
must match the number of rows of gen
.
In whole-genome regression models, the regularization of the genetic term is either based on chosen prior (t, Gaussian), Gaussian (from ridge regression) and t (from BayesA). Kernel models (GBLUP and RKHS) are regularized as Gaussian process, which is similar to a ridge regression of Eigenvectors where the regularization of Eigenpairs also relies on the Eigenvalues.
If there is a large number of trials and users acknowledge the necessity of sparse matrices, we recommend installing the Matrix package and run the following code that enables sparsity:
source(system.file("add","sparseGMM.R",package="NAM"))
Value
The function gmm returns a list containing the fitted values (hat
), observed values (obs
), intercept (mu
, incidence matrix of genotypes (Z
) and the vector of breeding values (EBV
). If fixed effects are provided, it also returns the design matrices and coefficients of fixed effects (X
,b
). If the model was kernel or regression, output will include
the random effect coefficients (g
), variance components of markers (Vg
) and residuals (Ve
). Kernel models regress the Eigenvectors of the kernel, weighted by the Eigenvalue. The coefficient (cxx
) used in the BRR
model to convert marker variance Vb
into genetic variance Va
. If spatial information is provided, the output includes the fitted spatial term (sp
).
Author(s)
Alencar Xavier
Examples
## Not run:
# Checking heritability
data(tpod)
fit = gmm(y,gen,model = 'BRR')
fit$Vg*fit$cxx / (fit$Vg*fit$cxx+fit$Ve)
# For a demo with wulti-environmental trial type:
# demo(fittingMET)
## End(Not run)