genotypeProbs {polysat} | R Documentation |
Calculate Probabilities of Unambiguous Genotypes
Description
Given an ambiguous genotype and either a data frame of allele
frequencies or a vector of genotype probabilities,
genotypeProbs
calculates all possible unambiguous genotypes and
their probabilities of being the true genotype.
Usage
genotypeProbs(object, sample, locus, freq = NULL, gprob = NULL,
alleles = NULL)
Arguments
object |
An object of class |
sample |
Number or character string indicating the sample to evaluate. |
locus |
Character string indicating the locus to evaluate. |
freq |
A data frame of allele frequencies, such as that produced by
|
gprob |
A vector of genotype probabilities based on allele frequencies and
selfing rate. This is generated by |
alleles |
An integer vector of all alleles. This argument should only be used
if |
Details
This function is primarily designed to be called by
meandistance.matrix2
, in order to calculate distances between all
possible unambiguous genotypes. Ordinary users won't use
genotypeProbs
unless they are designing a new analysis.
The genotype analyzed is Genotype(object, sample, locus)
. If
the genotype is unambiguous (fully heterozygous or homozygous), a single
unambiguous genotype is returned with a probability of one.
If the genotype is ambiguous (partially heterozygous), a recursive algorithm is used to generate all possible unambiguous genotypes (all possible duplications of alleles in the genotype, up to the ploidy of the individual.)
If the freq
argument is supplied:
The probability of each unambiguous genotype is then calculated from the allele frequencies of the individual's population, under the assumption of random mating. Allele frequencies are normalized so that the frequencies of the alleles in the ambiguous genotype sum to one; this converts each frequency to the probability of the allele being present in more than one copy. The product of these probabilities is multiplied by the appropriate polynomial coefficient to calculate the probability of the unambiguous genotype.
p = \prod_{i=1}^n f_{i}^{c_i} * \frac{(k-n)!}{\prod_{i=1}^n c_i!}
where p is the probability of the unambiguous genotype, n is the number of alleles in the ambiguous genotype, f is the normalized frequency of each allele, c is the number of duplicated copies (total number of copies minus one) of the allele in the unambiguous genotype, and k is the ploidy of the individual.
If the gprob
and alleles
arguments are supplied:
The probabilities of all possible genotypes in the population have
already been calculated, based on allele frequencies and selfing rate.
This is done in meandistance.matrix2
using code from De Silva
et al. (2005).
Probabilities for the genotypes of interest (those that the ambiguous
genotype could represent) are normalized to sum to 1, in order to give
the conditional probabilities of the possible genotypes.
Value
probs |
A vector containing the probabilities of each unambiguous genotype. |
genotypes |
A matrix. Each row represents one genotype, and the number of columns is equal to the ploidy of the individual. Each element of the matrix is an allele. |
Author(s)
Lindsay V. Clark
References
De Silva, H. N., Hall, A. J., Rikkerink, E., and Fraser, L. G. (2005) Estimation of allele frequencies in polyploids under certain patterns of inheritance. Heredity 95, 327–334
See Also
Examples
# get a data set and define ploidies
data(testgenotypes)
Ploidies(testgenotypes) <- c(8,8,8,4,8,8,rep(4,14))
# get allele frequencies
tfreq <- simpleFreq(testgenotypes)
# see results of genotypeProbs under different circumstances
Genotype(testgenotypes, "FCR7", "RhCBA15")
genotypeProbs(testgenotypes, "FCR7", "RhCBA15", tfreq)
Genotype(testgenotypes, "FCR10", "RhCBA15")
genotypeProbs(testgenotypes, "FCR10", "RhCBA15", tfreq)
Genotype(testgenotypes, "FCR1", "RhCBA15")
genotypeProbs(testgenotypes, "FCR1", "RhCBA15", tfreq)
Genotype(testgenotypes, "FCR2", "RhCBA23")
genotypeProbs(testgenotypes, "FCR2", "RhCBA23", tfreq)
Genotype(testgenotypes, "FCR3", "RhCBA23")
genotypeProbs(testgenotypes, "FCR3", "RhCBA23", tfreq)