mdf {dmm}    R Documentation

Prepare a dataframe for use with dmm function

Description

The function mdf() converts an R dataframe to one which meets the requirements of function dmm(), and may optionally append to that dataframe one or more relationship matrices obtained using package nadiv. Conversion involves renumbering pedigree Id's, removing duplicates, adding base animals, setting up columns to be fixed factors, putting multivariate traits into a matrix, defining the heterogametic sex, and optionally calling nadiv functions to append relationship matrices.

Usage

mdf(df, pedcols = c(1:3), factorcols = NULL, ycols = NULL, sexcode = NULL,
    keep = F, relmat = NULL)

Arguments

df

A dataframe object with columns labelled:

Id

An identifier for each individual

SId

An identifier for each sire

DId

An identifier for each dam

Sex

A coding for sex of each individual

Fixed effect names

Codings for each fixed effect

Observation names

Numerical values for each trait

pedcols

A vector specifying which columns of df contain the pedigree information (i.e. Id, SId, and DId). The vector can contain either column numbers or column names. The default is c(1:3).

factorcols

A vector specifying which columns of df contain codes for factors which are to be used as either fixed effects or in defining cohort. The default is NULL.

ycols

A vector specifying which columns of df contain observations which are to become traits in a matrix. The default is NULL. The matrix is always called 'Ymat'.

sexcode

A vector of length 2 specifying the codings used for Sex, with the code for the heterogametic sex given first. This should always be specified, even though the default is NULL. The type of sexcode must match the type of the Sex column in the dataframe df. If Sex is a character vector, or a character vector coerced to a factor, then sexcode should be a character vector. If Sex is an integer vector, or an integer vector coerced to a factor, then sexcode should be an integer vector.
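The type-matching rule for sexcode can be checked before calling mdf(). The following is a minimal base R sketch, not part of dmm: the helper sexcode_type() and the toy data are hypothetical, and the test for "integer-like" factor levels is an assumption about how integer codes look once coerced to a factor.

```r
# Hypothetical toy data: Sex stored as a factor built from character codes.
toy <- data.frame(Id = 1:4, Sex = factor(c("M", "F", "F", "M")))

# Hypothetical helper (not part of dmm): report which type sexcode should
# have for a given Sex column.
sexcode_type <- function(sex) {
  if (is.factor(sex)) {
    # Factor levels are always stored as character strings, so inspect them
    # to guess whether the original codes were integers.
    lev <- levels(sex)
    if (all(grepl("^-?[0-9]+$", lev))) "integer" else "character"
  } else if (is.numeric(sex)) {
    "integer"
  } else {
    "character"
  }
}

sexcode_type(toy$Sex)        # character codes: use e.g. c("M", "F")
sexcode_type(c(1L, 2L, 2L))  # integer codes: use e.g. c(1L, 2L)
```

Which code belongs in first position depends on the species; in the sheep example on this page the heterogametic sex is "M".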

keep

A logical variable. Are columns not specified by pedcols, factorcols, or ycols to be retained in the output object? Default is FALSE - ie unused columns are discarded.

relmat

A vector listing the relationship matrices to be generated and appended to the dataframe thus creating a return object of class mdf. Each relationship matrix has a code letter or name as follows:

"E"

An environmental correlation matrix. At present this produces an identity matrix - ie no environmental correlation effects. Must always be included.

"A"

Additive genetic relationship matrix.

"D"

Dominance relationship matrix.

"Dsim"

Dominance relationship matrix by the simulation method (see nadiv).

"AA"

Additive x additive epistatic relationship matrix.

"AD"

Additive x dominance epistatic relationship matrix.

"DD"

Dominance x dominance relationship matrix.

"S"

Sex linked additive genetic relationship matrix with no global dosage compensation ('ngdc' option, see nadiv).

"S.hori"

Sex linked additive genetic relationship matrix with 'hori' dosage compensation model (see nadiv).

"S.hedo"

Sex linked additive genetic relationship matrix with 'hedo' dosage compensation model (see nadiv).

"S.hoha"

Sex linked additive genetic relationship matrix with 'hoha' dosage compensation model (see nadiv).

"S.hopi"

Sex linked additive genetic relationship matrix with 'hopi' dosage compensation model (see nadiv).

Default is NULL - ie no relationship matrices constructed.
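Because "E" must always be included whenever relmat is given, it can help to validate the vector before passing it to mdf(). The following is a minimal base R sketch; check_relmat() is a hypothetical helper and not part of dmm, and the list of valid codes is simply copied from the entries above.

```r
# Codes listed on this help page for the relmat argument.
valid_codes <- c("E", "A", "D", "Dsim", "AA", "AD", "DD",
                 "S", "S.hori", "S.hedo", "S.hoha", "S.hopi")

# Hypothetical helper (not part of dmm): check a relmat vector before use.
check_relmat <- function(relmat) {
  if (is.null(relmat)) return(TRUE)  # NULL means no matrices are requested
  unknown <- setdiff(relmat, valid_codes)
  if (length(unknown) > 0)
    stop("unknown relmat code(s): ", paste(unknown, collapse = ", "))
  if (!"E" %in% relmat)
    stop("relmat must include \"E\"")
  TRUE
}

check_relmat(c("E", "A", "D"))  # passes: "E" is present, all codes are known
```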

Details

If planning to use numerical observations as covariates in the fixed effects model under dmm() use argument keep=TRUE, so that the covariate columns are retained in the returned dataframe object.
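To see which columns would be affected, one can list the columns named in none of pedcols, factorcols, or ycols; according to the description of keep, these are the ones discarded by default. The following is a minimal base R sketch with a hypothetical toy dataframe in which Age is an intended covariate; it does not call mdf() itself.

```r
# Hypothetical toy dataframe; Age is a covariate not named in any argument.
toy <- data.frame(Id = 1:3, SId = NA, DId = NA, Year = c(1, 1, 2),
                  Sex = c("M", "F", "F"), Age = c(2.5, 3.0, 2.0),
                  Bwt = c(4.1, 3.9, 4.4))
pedcols    <- 1:3   # Id, SId, DId
factorcols <- 4:5   # Year, Sex
ycols      <- 7     # Bwt

# Column positions named in none of the three arguments.
unused <- setdiff(seq_along(toy), c(pedcols, factorcols, ycols))
names(toy)[unused]  # the columns that need keep=TRUE to survive
```

Here the only unused column is Age, so a call intended to use Age as a covariate would need keep=TRUE.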

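The following actions are performed by mdf(): pedigree Ids are renumbered, duplicates are removed, base animals are added, the columns named in factorcols are set up as factors, the columns named in ycols are combined into the trait matrix 'Ymat', the heterogametic sex is defined from sexcode, and, if relmat is given, nadiv functions are called to build the requested relationship matrices.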

Value

The return object is of class mdf if relationship matrices are requested, and is of class dataframe if relationship matrices are not requested.

An object of class mdf is a list containing the following items:

df

A dataframe conforming to the requirements of function dmm()

rel

A list of relationship matrices

An object of class dataframe as returned by function mdf() is a dataframe conforming to the requirements of function dmm()
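Since the return type depends on whether relmat was given, downstream code may want a uniform way to reach the dataframe. The following is a minimal base R sketch; get_mdf_df() is a hypothetical helper and not part of dmm, it assumes the class attribute of the list form is "mdf" as stated above, and the objects used are mock stand-ins rather than real mdf() output.

```r
# Hypothetical helper (not part of dmm): return the dataframe from either
# kind of mdf() result.
get_mdf_df <- function(x) {
  if (inherits(x, "mdf")) x$df else x
}

# Mock objects standing in for real mdf() output.
plain  <- data.frame(Id = 1:3)
withrm <- structure(list(df = plain, rel = list(diag(3))), class = "mdf")

str(get_mdf_df(plain))   # a dataframe, returned unchanged
str(get_mdf_df(withrm))  # the df item extracted from the list
```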

Note

Individuals which appear in the SId or DId columns, but not in the Id column are assumed to be 'base individuals', ie they have unknown sire and dam. They will be given an Id and added to the dataframe, but their SId and DId and all data except for Sex coding will be set to NA, so they will be assumed unrelated and will not contribute data. It is important that 'base individuals' be present for relationship matrices to be calculated correctly.
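The rule for identifying base individuals can be illustrated in base R. This is a minimal sketch on a hypothetical toy pedigree; it only shows which identifiers would count as base individuals and is not the code mdf() uses internally.

```r
# Hypothetical toy pedigree: s1, d1, and d2 are parents with no row of their own.
ped <- data.frame(
  Id  = c("a1", "a2", "a3"),
  SId = c("s1", "s1", "a1"),
  DId = c("d1", "d2", "d1"),
  stringsAsFactors = FALSE
)

# All identifiers used as a sire or dam, ignoring unknown (NA) parents.
parents <- unique(c(ped$SId, ped$DId))
parents <- parents[!is.na(parents)]

# Base individuals: parents that never appear in the Id column.
base_ids <- setdiff(parents, ped$Id)
base_ids
```

In this toy pedigree a1 is a sire but also has its own row, so it is not a base individual; s1, d1, and d2 are.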

Author(s)

Neville Jackson

See Also

Functions dmm(), pedrenum(). Package nadiv.

Examples

library(dmm)

# prepare a multi-trait dataset from sheep.df
data(sheep.df)
# look at its structure
str(sheep.df)
# needs some work - Id, SId, DId are alphanumeric
#                 - Year is numeric and we want it as a factor
#                 - there are 3 traits (Cww,Diam,Bwt) to put into a trait matrix
sheep.mdf1 <- mdf(sheep.df,pedcols=c(1:3), factorcols=c(4:6), ycols=c(7:9),
             sexcode=c("M","F"))
# note the screen messages - it also had to add 2 base Id's for 2 of the dams
str(sheep.mdf1)
# so it returned a dataframe object with 44 observations
# and one of the columns is a matrix called 'Ymat'

# prepare a dataset requiring relationship matrices
sheep.mdf2 <- mdf(sheep.df,pedcols=c(1:3), factorcols=c(4:6), ycols=c(7:9),
             sexcode=c("M","F"),relmat=c("E","A"))
# note the screen messages - it now makes an object of class mdf
str(sheep.mdf2)
# so it returned a list object with 2 items
#    df - the dataframe
#   rel - a list of relationship matrices ( note those not requested are NULL)

[Package dmm version 2.1-9 Index]