R: Optimum Contribution Selection and Population Genetics

optiSel-package {optiSel}

R Documentation

Optimum Contribution Selection and Population Genetics

Description

A framework for the optimization of breeding programs via optimum contribution selection and mate allocation. An easy to use set of function for computation of optimum contributions of selection candidates, and of the population genetic parameters to be optimized. These parameters can be estimated using pedigree or genotype information, and include kinships, kinships at native haplotype segments, and breed composition of crossbred individuals. They are suitable for managing genetic diversity, removing introgressed genetic material, and accelerating genetic gain. Additionally, functions are provided for computing genetic contributions from ancestors, inbreeding coefficients, the native effective size, the native genome equivalent, pedigree completeness, and for preparing and plotting pedigrees. The methods are described in:\n Wellmann, R., and Pfeiffer, I. (2009) <doi:10.1017/S0016672309000202>.\n Wellmann, R., and Bennewitz, J. (2011) <doi:10.2527/jas.2010-3709>.\n Wellmann, R., Hartwig, S., Bennewitz, J. (2012) <doi:10.1186/1297-9686-44-34>.\n de Cara, M. A. R., Villanueva, B., Toro, M. A., Fernandez, J. (2013) <doi:10.1111/mec.12560>.\n Wellmann, R., Bennewitz, J., Meuwissen, T.H.E. (2014) <doi:10.1017/S0016672314000196>.\n Wellmann, R. (2019) <doi:10.1186/s12859-018-2450-5>.

Details

Optimum Contribution Selection

After kinships, breeding values and/or native contributions of the selection candidates have been computed, function candes can be used to create an R-object containing all this information. The current average kinships and trait values are estimated by this function, and the available objective functions and constraints for optimum contribution selection are reported. The following function can then be used to compute optimum contributions:

opticont	Calculates optimum genetic contributions of selection candidates to the next generation,
	and checks if all constraints are fulfilled.

Function noffspring can be used to compute the optimum numbers of offspring of selection candidates from their optimum contributions. Function matings can be used for mate allocation.

Kinships

For pairs of individuals the following kinships can be computed:

pedIBD	Calculates pedigree based probability of alleles to be IBD ("pedigree based kinship""),
segIBD	Calculates segment based probability of alleles to be IBD ("segment based kinship"),
pedIBDatN	Calculates pedigree based probability of alleles to be IBD at segments with Native origin,
segIBDatN	Calculates segment based probability of alleles to be IBD at segments with Native origin,
pedIBDorM	Calculates pedigree based probability of alleles to be IBD or Migrant alleles,
segIBDandN	Calculates segment based probability of alleles to be IBD and have Native origin,
segN	Calculates segment based probability of alleles to have Native origin,
makeA	Calculates the pedigree-based additive relationship matrix.

Phenotypes and results from these functions can be combined with function candes into a single R object, which can then be used as an argument to function opticont.

The segment based kinship can be used to calculate the optimum contributions of different breeds to a hypothetical multi-breed population with maximum genetic diversity by using function opticomp.

Function sim2dis can be used to convert a similarity matrix (e.g. a kinship matrix) into a dissimilarity matrix which is suitable for multidimensional scaling.

Breed Composition

The breed composition of crossbred individuals can be accessed with

pedBreedComp	Calculates pedigree based the Breed Composition, which is the genetic contribution
	of each individual from other breeds and from native founders. The native contribution
	is the proportion of the genome not originating from other breeds.
segBreedComp	Calculates segment based the Breed Composition. The native contribution is the
	proportion of the genome belonging to segments that have low frequency in
	other breeds.

The native contributions obtained by the above functions can be constrained or maximized with function opticont to remove introgressed genetic material, or alternatively, the segment-based native contribution can be considered a quantitative trait and included in a selection index.

Haplotype frequencies

Frequencies of haplotype segments in particular breeds can be computed and plotted with

haplofreq	Calculates the maximum frequency each segment has in a set of reference breeds,
	and the name of the breed in which the segment has maximum frequency.
	Identification of native segments.
freqlist	Combines results obtained with function haplofreq for different reference breeds
	into a single R object which is suitable for plotting.
plot.HaploFreq	Plots frequencies of haplotype segments in particular reference breeds.

Inbreeding Coefficients and Genetic Contributions

The inbreeding coefficients and genetic contributions from ancestors can be computed with:

pedInbreeding	Calculates pedigree based Inbreeding.
segInbreeding	Calculates segment based Inbreeding, i.e. inbreeding based on
	runs of homozygosity (ROH).
genecont	Calculates genetic contributions each individual has from all it's ancestors in
	the pedigree.

Preparing and plotting pedigree data

There are some functions for preparing and plotting pedigree data

prePed	prepares a Pedigree by sorting, adding founders and pruning the pedigree,
completeness	Calculates pedigree completeness in all ancestral generations,
summary.Pedig	Calculates number of equivalent complete generations, number of fully
	traced generations, number of maximum generations traced, index of
	pedigree completeness, inbreeding coefficients,
subPed	Creates a subset of a large Pedigree,
pedplot	Plots a pedigree,
sampleIndiv	Samples individuals from a pedigree.

Population Parameters

Finally, there are some functions for estimating population parameters:

conttac	Calculates genetic contributions of breeds to age cohorts,
summary.candes	Calculates for every age cohort several genetic parameters. These may
	include average kinships, kinships at native loci,
	the native effective size, and the native genome equivalent.

Genotype File Format

All functions reading genotype data assume that the files are in the following format:

Genotypes are phased and missing genotypes have been imputed. Each file has a header and no row names. Cells are separated by blank spaces. The number of rows is equal to the number of markers from the respective chromosome and the markers are in the same order as in the map. There can be some extra columns on the left hand side containing no genotype data. The remaining columns contain genotypes of individuals written as two alleles separated by a character, e.g. A/B, 0/1, A|B, A B, or 0 1. The same two symbols must be used for all markers. Column names are the IDs of the individuals. If the blank space is used as separator then the ID of each individual should be repeated in the header to get a regular delimited file. The columns to be skipped and the individual IDs must have no white spaces.

Use function read.indiv to extract the IDs of the individuals from a genotype file.

Author(s)

Robin Wellmann [aut, cre]

Maintainer: Robin Wellmann <r.wellmann@uni-hohenheim.de>

References

de Cara MAR, Villanueva B, Toro MA, Fernandez J (2013). Using genomic tools to maintain diversity and fitness in conservation programmes. Molecular Ecology. 22: 6091-6099

Wellmann, R., and Pfeiffer, I. (2009). Pedigree analysis for conservation of genetic diversity and purging. Genet. Res. 91: 209-219

Wellmann, R., and Bennewitz, J. (2011). Identification and characterization of hierarchical structures in dog breeding schemes, a novel method applied to the Norfolk Terrier. Journal of Animal Science. 89: 3846-3858

Wellmann, R., Hartwig, S., Bennewitz, J. (2012). Optimum contribution selection for conserved populations with historic migration; with application to rare cattle breeds. Genetics Selection Evolution. 44: 34

Wellmann, R., Bennewitz, J., Meuwissen, T.H.E. (2014) A unified approach to characterize and conserve adaptive and neutral genetic diversity in subdivided populations. Genet Res (Camb). 69: e16

Wellmann, R. (2019). Optimum Contribution Selection and Mate Allocation for Breeding: The R Package optiSel. BMC Bioinformatics 20:25

Examples


#See ?opticont for optimum contribution selection 
#These examples demonstrate computation of some population genetic parameters.

data(ExamplePed)
Pedig <- prePed(ExamplePed, thisBreed="Hinterwaelder", lastNative=1970)
head(Pedig)

############################################
# Evaluation of                            #
#    - kinships                            #
#    - genetic diversities                 #
#    - native effective size               #
#    - native genome equivalent            #
############################################

phen    <- Pedig[Pedig$Breed=="Hinterwaelder",]
pKin    <- pedIBD(Pedig)
pKinatN <- pedIBDatN(Pedig, thisBreed="Hinterwaelder")
pop     <- candes(phen=phen, pKin=pKin, pKinatN=pKinatN, quiet=TRUE, reduce.data=FALSE)
Param   <- summary(pop, tlim=c(1970,2005), histNe=150, base=1800, df=4)


plot(Param$t, Param$Ne, type="l", ylim=c(0,150), 
     main="Native Effective Size", ylab="Ne", xlab="")

matplot(Param$t, Param[,c("pKin", "pKinatN")], 
        type="l",ylim=c(0,1),main="Kinships", xlab="Year", ylab="mean Kinship")
abline(0,0)
legend("topleft", legend = c("pKin", "pKinatN"), lty=1:2, col=1:2, cex=0.6)

info <- paste("Base Year =", attributes(Param)$base, "  historic Ne =", attributes(Param)$histNe)

plot(Param$t,Param$NGE,type="l",main="Native Genome Equivalents", 
     ylab="NGE",xlab="",ylim=c(0,7))
mtext(info, cex=0.7)


############################################
# Genetic contributions from other breeds  #
############################################

cont <- pedBreedComp(Pedig, thisBreed='Hinterwaelder')
contByYear <- conttac(cont, Pedig$Born, use=Pedig$Breed=="Hinterwaelder", mincont=0.04, long=FALSE)
round(contByYear,2)

barplot(contByYear, ylim=c(0,1), col=1:10, ylab="genetic contribution",
        legend=TRUE, args.legend=list(x="topleft",cex=0.6))


######################################################
# Frequencies of haplotype segments in other breeds  #
######################################################

data(map)
data(Cattle)
dir   <- system.file("extdata", package="optiSel")
files <- file.path(dir, paste("Chr", 1:2, ".phased", sep=""))

Freq <- freqlist(
  haplofreq(files, Cattle, map, thisBreed="Angler", refBreeds="Rotbunt",   minSNP=20),
  haplofreq(files, Cattle, map, thisBreed="Angler", refBreeds="Holstein",  minSNP=20),
  haplofreq(files, Cattle, map, thisBreed="Angler", refBreeds="Fleckvieh", minSNP=20)
  )

plot(Freq, ID=1, hap=2, refBreed="Rotbunt")

[Package optiSel version 2.0.9 Index]