pgen {poppr} | R Documentation |
Genotype Probability
Description
Calculate the probability of genotypes based on the product of allele frequencies over all loci.
Usage
pgen(gid, pop = NULL, by_pop = TRUE, log = TRUE, freq = NULL, ...)
Arguments
gid |
a genind or genclone object. |
pop |
either a formula to set the population factor from the
|
by_pop |
When this is |
log |
a |
freq |
a vector or matrix of allele frequencies. This defaults to
|
... |
options from correcting rare alleles. The default is to correct allele frequencies to 1/n |
Details
Pgen is the probability of a given genotype occuring in a population assuming HWE. Thus, the value for diploids is
P_{gen} = \left(\prod_{i=1}^m p_i\right)2^h
where p_i
are the allele frequencies and h is the count of the
number of heterozygous sites in the sample (Arnaud-Haond et al. 2007; Parks
and Werth, 1993). The allele frequencies, by default, are calculated using
a round-robin approach where allele frequencies at a particular locus are
calculated on the clone-censored genotypes without that locus.
To avoid issues with numerical precision of small numbers, this function calculates pgen per locus by adding up log-transformed values of allele frequencies. These can easily be transformed to return the true value (see examples).
Value
A vector containing Pgen values per locus for each genotype in the object.
Note
For haploids, Pgen at a particular locus is the allele frequency. This
function cannot handle polyploids. Additionally, when the argument
pop
is not NULL
, by_pop
is automatically TRUE
.
Author(s)
Zhian N. Kamvar, Jonah Brooks, Stacy A. Krueger-Hadfield, Erik Sotka
References
Arnaud-Haond, S., Duarte, C. M., Alberto, F., & SerrĂ£o, E. A. 2007. Standardizing methods to address clonality in population studies. Molecular Ecology, 16(24), 5115-5139.
Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 537-544.
See Also
psex
, rraf
, rrmlg
,
rare_allele_correction
Examples
data(Pram)
head(pgen(Pram, log = FALSE))
## Not run:
# You can also supply the observed allele frequencies
pramfreq <- Pram %>% genind2genpop() %>% tab(freq = TRUE)
head(pgen(Pram, log = FALSE, freq = pramfreq))
# You can get the Pgen values over all loci by summing over the logged results:
pgen(Pram, log = TRUE) %>% # calculate pgen matrix
rowSums(na.rm = TRUE) %>% # take the sum of each row
exp() # take the exponent of the results
# You can also take the product of the non-logged results:
apply(pgen(Pram, log = FALSE), 1, prod, na.rm = TRUE)
## Rare Allele Correction ---------------------------------------------------
##
# If you don't supply a table of frequencies, they are calculated with rraf
# with correction = TRUE. This is normally benign when analyzing large
# populations, but it can have a great effect on small populations. To help
# control this, you can supply arguments described in
# help("rare_allele_correction").
# Default is to correct by 1/n per population. Since the calculation is
# performed on a smaller sample size due to round robin clone correction, it
# would be more appropriate to correct by 1/rrmlg at each locus. This is
# acheived by setting d = "rrmlg". Since this is a diploid, we would want to
# account for the number of chromosomes, and so we set mul = 1/2
head(pgen(Pram, log = FALSE, d = "rrmlg", mul = 1/2)) # compare with the output above
# If you wanted to treat all alleles as equally rare, then you would set a
# specific value (let's say the rare alleles are 1/100):
head(pgen(Pram, log = FALSE, e = 1/100))
## End(Not run)