poolPops {poolHelper} | R Documentation |
Create Pooled DNA sequencing data for multiple populations
Description
This function combines the information for each individual of each population into information at the population level.
Usage
poolPops(nPops, nLoci, indContribution, readsReference)
Arguments
nPops |
An integer representing the total number of populations in the dataset. |
nLoci |
An integer that represents the total number of independent loci in the dataset. |
indContribution |
Either a list or a matrix (when dealing with a single locus). |
readsReference |
A list, where each entry contains the information for a
single locus. Each list entry should then have one separate entry per
population. Each of these entries should be a matrix, with each row
corresponding to a single individual and each column a different site.
Thus, each entry of the matrix contains the number of observed reads with
the reference allele for that individual at a given site. The output of the
|
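The nesting described for readsReference can be illustrated with a hand-built object. The sketch below uses base R only. The names `makePop` and `toyReference` and the dimensions (one locus, two populations, three individuals per population, four sites) are invented for illustration and are not produced by poolHelper.

```r
# hypothetical toy input: 1 locus, 2 populations, 3 individuals each, 4 sites
set.seed(1)
makePop <- function(nInd, nSites) {
  # rows are individuals, columns are sites; entries are reference-read counts
  matrix(rpois(nInd * nSites, lambda = 3), nrow = nInd, ncol = nSites)
}
# outer list: one entry per locus; inner list: one entry per population
toyReference <- list(
  list(makePop(3, 4), makePop(3, 4))
)
length(toyReference)          # number of loci
length(toyReference[[1]])     # number of populations at the first locus
dim(toyReference[[1]][[1]])   # individuals x sites for the first population
```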
Details
For each locus, the information of all individuals in a given population is
combined into a single population value, and this is repeated for every
population. Each entry of the indContribution and readsReference lists
should therefore contain one entry per population, so that each is, in
essence, a list within a list. Please note that this function is intended to
work with multiple populations and should not be used with a single
population.
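As a rough illustration of this pooling step, the base R sketch below handles a single locus. It assumes that pooling means summing reads over the individuals of each population, and that alternative reads equal total reads minus reference reads; the package's internal implementation may differ. All object names (`toyTotal`, `toyReference`, `pooled`) are hypothetical.

```r
# hypothetical toy data for one locus: 2 populations, 3 individuals each, 4 sites
set.seed(2)
toyTotal <- list(
  matrix(rpois(12, lambda = 10), nrow = 3),  # population 1: total reads per individual and site
  matrix(rpois(12, lambda = 10), nrow = 3)   # population 2
)
# reference reads cannot exceed total reads, so draw them binomially
toyReference <- lapply(toyTotal, function(tot) {
  matrix(rbinom(length(tot), size = tot, prob = 0.7), nrow = nrow(tot))
})
# assumed pooling rule: sum over individuals (rows) within each population
pooledTotal <- do.call(rbind, lapply(toyTotal, colSums))
pooledReference <- do.call(rbind, lapply(toyReference, colSums))
# assumed: alternative reads are what remains after removing reference reads
pooledAlternative <- pooledTotal - pooledReference
# in each matrix, rows are populations and columns are sites
pooled <- list(reference = pooledReference,
               alternative = pooledAlternative,
               total = pooledTotal)
```

Each matrix in `pooled` corresponds to one locus; with several loci, the same step would be repeated per locus and the resulting matrices collected in a list.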
Value
A list with three named entries:
reference |
a list with one entry per locus. Each entry is a matrix with the number of reference allele reads for each population. Each column represents a different site and each row a different population. |
alternative |
a list with one entry per locus. Each entry is a matrix with the number of alternative allele reads for each population. Each column represents a different site and each row a different population. |
total |
a list with one entry per locus. Each entry is a matrix with the coverage of each population. Each column represents a different site and each row a different population. |
Examples
# simulate coverage at 5 SNPs for two populations, assuming 20x mean coverage
reads <- simulateCoverage(mean = c(20, 20), variance = c(100, 100), nSNPs = 5, nLoci = 1)
# simulate the number of reads contributed by each individual
# for each population there are two pools, each with 5 individuals
indContribution <- popsReads(list_np = rep(list(rep(5, 2)), 2), coverage = reads, pError = 5)
# set seed and create a random matrix of genotypes for the 20 individuals - 10 per population
set.seed(10)
genotypes <- matrix(rpois(100, 0.5), nrow = 20)
# simulate the number of reference reads for the two populations
readsReference <- numberReferencePop(genotypes = genotypes, indContribution = indContribution,
                                     size = rep(list(rep(5, 2)), 2), error = 0.01)
# create Pooled DNA sequencing data for these two populations and for a single locus
poolPops(nPops = 2, nLoci = 1, indContribution = indContribution, readsReference = readsReference)