AddGenotypeLikelihood {polyRAD} | R Documentation |
Estimate Genotype Likelihoods in a RADdata object
Description
For each possible allele copy number across each possible ploidy in each taxon,
AddGenotypeLikelihood
estimates the probability of observing the
distribution of read counts that are recorded for that taxon and locus.
AddDepthSamplingPermutations
is called by AddGenotypeLikelihood
the first time it is run, so that part of the likelihood calcluation is
stored in the RADdata
object and doesn't need to be re-run on each
iteration of the pipeline functions.
Usage
AddGenotypeLikelihood(object, ...)
## S3 method for class 'RADdata'
AddGenotypeLikelihood(object, overdispersion = 9, ...)
AddDepthSamplingPermutations(object, ...)
Arguments
object |
A |
overdispersion |
An overdispersion parameter. Higher values will cause the expected read depth distribution to more resemble the binomial distribution. Lower values indicate more overdispersion, i.e. sample-to-sample variance in the probability of observing reads from a given allele. |
... |
Other arguments; none are currently used. |
Details
If allele frequencies are not already recorded in object
, they will
be added using AddAlleleFreqHWE
. Allele frequencies are then
used for estimating the probability of sampling an allele from a genotype due
to sample contamination. Given a known genotype with x
copies of
allele i
, ploidy k
, allele frequency p_i
in the population used for
making sequencing libraries, and contamination rate c
, the probabiity of
sampling a read r_i
from that locus corresponding to that allele is
P(r_i | x) = \frac{x}{k} * (1 - c) + p_i * c
To estimate the genotype likelihood, where nr_i
is the number of reads
corresponding to allele i
for a given taxon and locus and nr_j
is the
number of reads corresponding to all other alleles for that taxon and locus:
P(nr_i, nr_j | x) = {{nr_i + nr_j}\choose{nr_i}} * \frac{B[P(r_i | x) * d + nr_i, [1 - P(r_i | x)] * d + nr_j]]}{B[P(r_i | x) * d, [1 - P(r_i | x)] * d]}
where
{{nr_i + nr_j}\choose{nr_i}} = \frac{(nr_i + nr_j)!}{nr_i! * nr_j!}
B is the beta function, and d
is the overdispersion parameter set by
overdispersion
. {{nr_i + nr_j}\choose{nr_i}}
is calculated by AddDepthSamplingPermutations
.
Value
A "RADdata"
object identical to that passed to the function, but with
genotype likelihoods stored in object$genotypeLikelihood
. This item is a
two dimensional list, with one row for each ploidy listed
in object$possiblePloidies
, ignoring differences between
autopolyploids and allopolyploids, and one column for each ploidy listed in
object$taxaPloidy
. Each item in the list is a three-dimensional
array with number of allele copies in the first dimension, taxa in the second dimension,
and alleles in the third dimension.
Author(s)
Lindsay V. Clark
See Also
Examples
# load example dataset and add allele frequency
data(exampleRAD)
exampleRAD <- AddAlleleFreqHWE(exampleRAD)
# estimate genotype likelihoods
exampleRAD <- AddGenotypeLikelihood(exampleRAD)
# inspect the results
# the first ten individuals and first two alleles, assuming diploidy
exampleRAD$alleleDepth[1:10,1:2]
exampleRAD$genotypeLikelihood[[1]][,1:10,1:2]
# assuming tetraploidy
exampleRAD$genotypeLikelihood[[2]][,1:10,1:2]