remove_realReads {poolABC} | R Documentation |
Remove sites, according to their coverage, from real data
Description
Removes sites that have too many or too few reads from the dataset.
Usage
remove_realReads(nPops, data, minimum, maximum)
Arguments
nPops |
is an integer representing the total number of populations in the dataset. |
data |
is a dataset containing information about real populations. This dataset should have lists with the allelic frequencies, the position of the SNPs, the range of the contig, the number of major allele reads, the number of minor allele reads and the depth of coverage. |
minimum |
the minimum depth of coverage allowed i.e. sites where the depth of coverage of any population is below this threshold are removed. |
maximum |
the maximum depth of coverage allowed i.e. sites where the depth of coverage of any population is above this threshold are removed. |
Details
The minimum and maximum inputs define, respectively, the minimum and maximum allowed coverage for the dataset. The coverage of each population at each site is compared with these thresholds, and any site where the coverage of at least one population is below the minimum or above the maximum is removed from the dataset entirely.
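The filtering rule described above can be illustrated on a toy coverage matrix. This is only a minimal sketch of the logic, not the package's actual implementation: the helper name sites_to_keep is hypothetical, and the bounds are assumed to be inclusive (a site with coverage exactly equal to minimum or maximum is kept), which follows from the wording "below" and "above" in the argument descriptions.

```r
# Hypothetical helper (not part of poolABC): flags the sites to keep.
# coverage is a matrix with one row per site and one column per population.
sites_to_keep <- function(coverage, minimum, maximum) {
  # a site is kept only if every population is within [minimum, maximum]
  apply(coverage >= minimum & coverage <= maximum, 1, all)
}

# toy coverage matrix: 4 sites (rows) x 2 populations (columns)
coverage <- matrix(c( 12,  50,
                       8,  40,
                      30, 200,
                     100, 180), ncol = 2, byrow = TRUE)

keep <- sites_to_keep(coverage, minimum = 10, maximum = 180)

# site 2 is dropped (8 < 10) and site 3 is dropped (200 > 180)
filtered <- coverage[keep, , drop = FALSE]
```

Note that a single population outside the bounds is enough to discard the whole site, so the filter becomes stricter as nPops grows.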
Value
a list with the following elements:
freqs |
a list with the allele frequencies, computed by dividing the number of minor-allele reads by the total coverage. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population. |
positions |
a list with the positions of each SNP. Each entry of this list is a vector corresponding to a different contig. |
range |
a list with the minimum and maximum SNP position of each contig. Each entry of this list is a vector corresponding to a different contig. |
rMajor |
a list with the number of major-allele reads. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population. |
rMinor |
a list with the number of minor-allele reads. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population. |
coverage |
a list with the total coverage. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population. |
This output is identical to the data input, the only difference being the removal of sites with too many or too few reads.
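To make the shape of this list concrete, the sketch below builds a toy dataset with a single contig and removes the same sites from every element in step. It is an illustration under assumptions, not the package's code: the toy values are made up, the filtering is done by hand rather than by remove_realReads, and the range element is simply carried over unchanged because the text above does not say whether it is recomputed.

```r
# toy data: one contig, 3 sites (rows) x 2 populations (columns)
rMajor <- matrix(c( 9,  20,
                    4,  30,
                   60, 150), ncol = 2, byrow = TRUE)
rMinor <- matrix(c( 3, 10,
                    2, 10,
                   20, 50), ncol = 2, byrow = TRUE)
coverage <- rMajor + rMinor

# same element names as the list described above, one entry per contig
toy <- list(
  freqs     = list(rMinor / coverage),
  positions = list(c(101, 250, 730)),
  range     = list(c(101, 730)),
  rMajor    = list(rMajor),
  rMinor    = list(rMinor),
  coverage  = list(coverage)
)

# sites where every population has between 10 and 180 reads
keep <- apply(toy$coverage[[1]] >= 10 & toy$coverage[[1]] <= 180, 1, all)

# drop the same rows from every per-site element of the contig
filtered <- toy
for (name in c("freqs", "rMajor", "rMinor", "coverage")) {
  filtered[[name]][[1]] <- toy[[name]][[1]][keep, , drop = FALSE]
}
filtered$positions[[1]] <- toy$positions[[1]][keep]
# range is left as-is here (assumption, see the note above)
```

Because all matrices are subset with the same logical vector, the rows of freqs, rMajor, rMinor and coverage, and the entries of positions, still refer to the same sites after filtering.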
Examples
# load the data from one rc file
data(rc1)
# clean and organize the data in this single file
mydata <- cleanData(file = rc1, pops = 7:10)
# organize the information by contigs
mydata <- prepareFile(data = mydata, nPops = 4)
# remove sites with fewer than 10 reads or more than 180
remove_realReads(nPops = 4, data = mydata, minimum = 10, maximum = 180)