dmm {dmm} | R Documentation |
Fit a dyadic mixed model to pedigree data
Description
Dyadic mixed model analysis with multi-trait responses and pedigree-based partitioning of individual variation into a range of environmental and genetic variance components for individual and maternal effects.
Usage
dmm(mdf, fixform = Ymat ~ 1, components = c("VarE(I)", "VarG(Ia)"),
specific.components=NULL, cohortform = NULL, posdef = T, gls = F,
glsopt = list(maxiter = 200, bdamp = 0.8, stoptol = 0.01),
dmeopt = "qr", ncomp.pcr = "rank", relmat = "inline", dmekeep = F,
dmekeepfit = F, traitspairwise=F, traitsblockwise=F,...)
## Default S3 method:
dmm(mdf, fixform = Ymat ~ 1,
components = c("VarE(I)", "VarG(Ia)"),
specific.components=NULL, cohortform = NULL, posdef = T, gls = F,
glsopt = list(maxiter = 200, bdamp = 0.8, stoptol = 0.01),
dmeopt = "qr", ncomp.pcr = "rank", relmat = "inline", dmekeep = F,
dmekeepfit = F, traitspairwise=F, traitsblockwise=F,...)
Arguments
mdf |
Either a dataframe or an object of class |
fixform |
A formula specifying the fixed-effect part of the model and the response variate(s). Response should be a matrix for multi-trait models. Default is |
components |
A simple vector specifying each of the components to be partitioned from the residual variation after fitting fixed effects. Residual is assumed to be the level of variation attributed to individuals. The components given here are assumed to sum to phenotypic variance so that if there are cross-effect covariances (eg "CovG(Ia,Ma)" and "CovG(Ma,Ia)") they need to be both present. The default is |
specific.components |
A list specifying the name of each specific factor and the variance components which are to be specific to that factor. The list takes the form
The default is If one wishes all of the specified components to be class specific, one needs to specify |
cohortform |
A formula specifying the effects which define cohort grouping of individuals. For example |
posdef |
A logical flag: should the matrices of variance components be constrained to be positive definite? If TRUE each matrix of cross-trait (co)variances for each "Varxxx" component defined in |
gls |
A logical flag: should |
glsopt |
A list object containing variables used to control the GLS iteration :
The GLS iteration normally converges very rapidly. If it does not, consider changing the model, before fiddling with these parameters. |
dmeopt |
One of four regression techniques used to solve the dyadic model equations (DME's) to estimate components:
If gls=TRUE the same |
ncomp.pcr |
Number of principal components to use during a principal components regression (see |
relmat |
One of two ways of setting up the relationship matrices required to estimate the variance components:
|
dmekeep |
Logical flag: should the dyadic model equations be returned as part of the |
dmekeepfit |
Logical flag: should the fit object from solving the DME's be returned as part of the |
traitspairwise |
Logical flag: should the traits be analysed two at a time in all permutations? Default is FALSE, in which case traits are all analysed simultaneously. This option is useful if traits have different replication. If this option is TRUE |
traitsblockwise |
Logical flag: should the traits be analysed in defined blocks of traits in all permutations of pairs of blocks? Default is FALSE. This option is useful if blocks of traits have different replication. If this option is TRUE, the ellipsis option defining blocks must be present. For this option the dataframe specified as argument |
... |
Ellipsis argument: if
, specifying the traits to be present in each block. In this case |
Details
The minimum requirement to use dmm()
directly on a simple dataframe is that it contain columns named "Id", "SId", "DId", and "Sex" plus any fixed effects and traits. The "Id" column must contain identifiers which are unique, numeric, and sequential ( ie they must be numbered 1 to n with unit increments, no duplicates and no gaps). Any fixed effects must be factors, and traits must be numeric. Every "SId" and "DId" code must appear also in the "Id" column even if this results in NA's in every other column. If these requirements are not met, process the dataframe with mdf()
before using dmm()
Also if any relationship matrix other than additive is required, pre-processing with mdf()
is necessary.
Missing values for either traits or fixed effects are simply omitted by dmm()
before any processing. There is an heirarchy of models fitted by dmm()
. There is one fixed model and one dyadic model, for all traits, and only individuals for which all traits are present are included in the model fit steps. In contrast, all individuals are included in the pedigree and in setting up relationship matrices. Hence the number of individuals with data, may be less than the number of individuals in the pedigree. If options traitspairwise
or traitsblockwise
are used both the fixed model and the dyadic model will be the same for all traitpairs (or traitblocks) but the replication may differ, so missing values for some traits or sets of traits can be handled in this way.
The (co)variance which is partitioned into components is always the residual (co)variance from the fixed effects model. This is assumed to represent the observed variation among individuals. There is, at this stage, no provision for models with more than one error level, so split plot and repeated measures designs are not provided for. There is nothing to stop one formulating the appropriate fixed effects model and doing the aov()
step, but partitioning of any (co)variance other than residual is not at present provided.
The naming conventions for components may seem a little strange. They are designed to be all ASCII and therefore usable by R as rownames or colnames. The function make.ctable()
returns a list of all available components ( as returnobject$all), as well as a spectrum of sublists which are used internally. The available components are fully documented in the pdf file dmmOverview.pdf Section 6. Most of the names are obvious (eg "VarG(Ia)" means variance-genetic-individual-additive). The term individual distinguishes individual or direct genetic or environmental effects from maternal genetic or environmental effects.
It is important for the proper estimation of phenotypic (co)variance that any cross-effect covariance components are fitted in symmetric pairs ( for example "CovE(I,M)" and "CovE(M,I)"). For one trait these will be identical, so the covariance will simply enter twice in the sum, as required. However cross-trait-cross-effect covariances will not be identical and the sum, which is a phenotypic covariance, requires that the symmetric pair be present.
In addition to the value returned, dmm()
makes a number of lines of screen output which show each processing step and some minimal model check numbers.
Value
An object of class dmm
is returned whenever options traitspairwise
and traitsblockwise
are both FALSE (ie a normal multi-trait analysis). This object is basically a list of some or all of the following items:
aov |
An object of class |
mdf |
Not currently used - ignore. |
fixform |
Formula specifying fixed effects fitted. |
b |
Coefficients fitted for fixed effects. |
seb |
Standard errors for fixed effect coefficients. |
vara |
Matrix of (co)variances of individuals after adjusting for fixed effects fitted by OLS. |
totn |
Total number of individuals in the analysis (ie with data) |
degf |
Degrees of freedom remaining after fitting fixed effects. |
dme.wmat |
The dyadic model equations matrix. Present only if dmekeep=TRUE. The name 'wmat' refers to the matrix |
dme.psi |
The dyadic model equations right hand sides (or traits) matrix. Present only if dmekeep=TRUE. The name 'psi' refers to the matrix |
dme.fit |
The fit object from solving dyadic model equations. Its form depends on the |
dme.mean |
Means of columns of dyadic model equation matrix |
dme.var |
Variances of columns of dyadic model equation matrix |
dme.correl |
Correlations between columns of dyadic model equation matrix. |
pcr.loadings |
Loadings from principal component regression. Only present when dmeopt="pcr". |
dmeopt |
Value of |
siga |
Variance component estimates obtained by solving DME's. |
sesiga |
Standard errors of variance component estimates. |
vard |
Residual (co)variance matrix from solving DME's |
degfd |
Degrees of freedom for residual covariance matrix. |
component |
Vector of component names from the |
correlation |
Estimated genetic or environmental correlation coefficient corresponding to each component |
correlation.variance |
Estimated sampling variance of each correlation coefficient. |
correlation.se |
Estimated standard error of each correlation coefficient. |
fraction |
Estimated proportion of variance (relative to the total phenotypic (co)variance)) corresponding to each component. For example the proportion corresponding to component "VarG(Ia)" is the usual additive genetic heritability. |
fraction.variance |
Estimated sampling variance of each proportion. |
fraction.se |
Estimated standard error of each proportion. |
variance.components |
Variance component estimates ( as in siga) but with their total which is phenotypic (co)variance appended. |
variance.components.se |
Standard errors of variance component estimates, including phenotypic (co)variance. |
phenotypic.variance |
Phenotypic (co)variances as a trait x trait matrix. |
phenotypic.variance.se |
Standard errors of phenotypic (co)variances as a matrix. |
observed.variance |
Observed variance, adjusted for fixed effects, in current population. Will differ from phenotypic variance because related animals are correlated. All the estimated components, and their sum ( which is phenotypic variance) are estimates of what the components would be in a population of unrelated individuals. |
call |
The call made to |
gls |
Another list containing all the above items, but for fixed effects fitted by GLS instead of OLS. Will only be present if gls=TRUE and if the gls iteration converged successfully. |
specific |
Another list containing those of the above items which are relevant when class-specific (co)varianc components are estimated. Will only be present if the argument |
If option traitspairwise
is TRUE, the value returned by dmm()
is an object of class dmmarray
, which is an array of which each element is an object of class dmm
representing an analysis for one pair of traits. The rows and columns of the array are named using trait names.
If option traitsblockwise
is TRUE, the value returned by dmm()
is an object of class dmmblockarray
, which is a list of two items named array
and blocks
. List item array
is an array of which each element is an object of class dmm
representing an analysis for one pair of blocks of traits. The rows and columns of the array are named using block names. List item blocks
is a list with one element per block, each containig the set of trait names present in the block.
The functions condense.dmmarray
and condense.dmmblockarray
are available to facilitate recombining an array of dmm
objects into a single object of class dmm
with the variance component and genetic parameter estimates appropriately assembled into multi-trait matrices. For example, one would use these functions to prepare an input file for function gresponse
.
Note
Two methods of estimating fixed effects are offered by dmm()
- termed OLS-b and GLS-b.
OLS-b is computationally simple and non-iterative and is the default. Use OLS-b for preliminary runs until the set of components to be estimated from the dyadic model equations is settled. Use GLS-b for the final run.
OLS-b leads to MINQUE estimates of the variance components and OLS estimates of the fixed effects. GLS-b leads to bias-corrected-ML estimates of the variance components , and GLS estimates of the fixed effects.
The notation for (co)variance components was designed to use ASCII characters only so that it could be usable as dimnames in R. Some examples should make it clear
- "VarE(Ia)"
variance environmental individual additive
- "VarG(Ia)"
variance genetic individual additive
- "VarG(Ma)"
variance genetic maternal additive
- "CovG(Ia,Ma)"
covariance genetic individual additive x maternal additive
- "VarGs(Ia)"
variance genetic sexlinked individual additive
For a full coverage of notation see the document dmmOverview.pdf Section 6.
Author(s)
Neville Jackson
References
The document dmmOverview.pdf has a bibliography of literature references.
See Also
Functions mdf()
, make.ctable()
, condense.dmmarray()
, condense.dmmblockarray()
.
Packages nadiv, robustbase, pls
Examples
library(dmm)
# Prepare the dataset sheep.df
data(sheep.df)
sheep.mdf <- mdf(sheep.df,pedcols=c(1:3),factorcols=c(4:6),ycols=c(7:9),
sexcode=c("M","F"),relmat=c("E","A","D"))
# The above code renumbers the pedigree Id's, makes columns "Year","Tb","Sex"
# into factors,
# assembles columns "CWW",Diam","Bwt" into a matrix (called 'Ymat')
# for multivariate processing,
# and sets up the environmental, additive genetic, and dominance genetic
# relationship matrices.
# a simple model with individual environmenmtal and
# additive genetic components (default)
sheep.fit <- dmm(sheep.mdf, Ymat ~ 1 + Year + Sex,
components=c("VarE(I)","VarG(Ia)"),gls=TRUE)
# view the components and fixed effect coefficients ( 2 traits only)
summary(sheep.fit,traitset=c("Cww","Diam"),gls=TRUE)
# view the genetic parameters
gsummary(sheep.fit,traitset=c("Cww","Diam"))
rm(sheep.df)
rm(sheep.mdf)
rm(sheep.fit)
# Note: sheep.df is a small demo dataset. The results are illustrative,
# but not meaningful.
# for a tutorial and fully documented examples see {\em dmmOverview.pdf}