Pfreqs {poolHelper} | R Documentation |
Compute allele frequencies from pooled sequencing data
Description
Computes the frequency of the alternative allele in Pool-seq data and removes any site with too few minor-allele reads from both the pool frequencies and the frequencies computed directly from genotypes.
Usage
Pfreqs(reference, alternative, coverage, min.minor, ifreqs)
Arguments
reference |
a matrix with the number of reference allele reads. Each row should be a different population and each column a different site. |
alternative |
a matrix with the number of alternative allele reads. Each row should be a different population and each column a different site. |
coverage |
a matrix with the total coverage. Each row should be a different population and each column a different site. |
min.minor |
an integer giving the minimum allowed number of minor-allele reads. Sites that, across all populations, have fewer minor-allele reads than this threshold are removed from the data. |
ifreqs |
a vector of allele frequencies computed directly from the genotypes where each entry corresponds to a different site. |
Details
The frequency at a given SNP is calculated as pi = c/r, where c is the number of alternative-allele reads and r is the total number of observed reads.
Additionally, if a site has fewer minor-allele reads than min.minor across all populations, that site is removed from the data.
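The calculation described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name pool_freqs_sketch is hypothetical, and the sketch assumes that the minor allele at a site is whichever allele has fewer reads when summed over all populations, and that a site is kept when that count is at least min.minor.

```r
# Hypothetical helper illustrating the calculation; not the poolHelper source.
pool_freqs_sketch <- function(reference, alternative, coverage, min.minor, ifreqs) {
  # total reads per allele at each site, summed across populations (rows)
  ref_total <- colSums(reference)
  alt_total <- colSums(alternative)
  # assumption: the minor allele is the one with fewer reads overall
  minor_total <- pmin(ref_total, alt_total)
  # keep sites with at least min.minor minor-allele reads
  keep <- minor_total >= min.minor
  # frequency of the alternative allele at the retained sites: pi = c / r
  pfreqs <- alternative[, keep, drop = FALSE] / coverage[, keep, drop = FALSE]
  # drop the same sites from the genotype-based frequencies
  list(ifreqs = ifreqs[keep], pfreqs = pfreqs)
}

# one population, four sites
alternative <- matrix(c(0, 5, 10, 1), nrow = 1)
coverage <- matrix(c(100, 100, 120, 110), nrow = 1)
reference <- coverage - alternative
out <- pool_freqs_sketch(reference, alternative, coverage, min.minor = 2,
                         ifreqs = c(0.01, 0.06, 0.09, 0.02))
out$pfreqs  # sites 1 and 4 are dropped; the remaining values are 5/100 and 10/120
```

Filtering ifreqs with the same logical index keeps the two sets of frequencies aligned site by site, so they can be compared directly afterwards.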
Value
a list with two entries. The ifreqs entry contains the allele frequencies computed directly from genotypes and the pfreqs entry contains the allele frequencies computed from pooled sequencing data.
Examples
set.seed(10)
# create a vector of allele frequencies
freqs <- runif(20)
set.seed(10)
# create a matrix with the number of reads with the alternative allele
alternative <- matrix(sample(x = c(0,5,10), size = 20, replace = TRUE), nrow = 1)
# create a matrix with the depth of coverage
coverage <- matrix(sample(100:150, size = 20), nrow = 1)
# the number of reads with the reference allele is obtained by subtracting
# the number of alternative allele reads from the depth of coverage
reference <- coverage - alternative
# compute allele frequencies from pooled sequencing data
Pfreqs(reference = reference, alternative = alternative, coverage = coverage,
       min.minor = 2, ifreqs = freqs)