preseqR.optimal.sequencing {preseqR} | R Documentation |
Optimal amount of sequencing for scWGS
Description
preseqR.optimal.sequencing predicts the optimal amount of sequencing in a single-cell whole-genome sequencing (scWGS) experiment based on a shallow sequencing experiment.
Usage
preseqR.optimal.sequencing(n, efficiency=0.05, bin=1e8, r=1, mt=20,
times=30, conf=0.95)
Arguments
n |
A two-column matrix.
The first column is the frequency |
efficiency |
The minimum benefit-cost ratio. Default is 0.05. |
bin |
One unit of sequencing effort. Default is 1e8. |
r |
A positive integer giving the minimum number of times a nucleotide must be covered to count as represented. Default is 1. |
mt |
A positive integer constraining possible rational function approximations. Default is 20. |
times |
The number of bootstrap samples. Default is 30. |
conf |
The confidence level. Default is 0.95. |
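As an illustration of the input format, the sketch below builds a two-column matrix from a toy vector of per-position coverage depths. The description of n above is cut off after "frequency", so the meaning of the second column used here (the number of positions observed with that frequency) is an assumption, and the depth vector is hypothetical.

```r
# Hypothetical per-position coverage depths (toy values, not real data).
depth <- c(0, 1, 1, 2, 1, 3, 0, 2, 1, 5)

# Count how many positions were covered exactly j times, ignoring j = 0.
counts <- table(depth[depth > 0])

# Two-column matrix: the frequency j, then the number of positions with
# that frequency (ASSUMED meaning of the second column).
n <- cbind(as.numeric(names(counts)), as.vector(counts))
n
```

The frequencies come out in ascending order because table() sorts its levels.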
Details
preseqR.optimal.sequencing predicts the optimal amount of sequencing in a scWGS experiment. The term optimal is interpreted as the maximum amount of sequencing for which the benefit-cost ratio is greater than a given threshold. The benefit-cost ratio is defined as the probability that a new nucleotide in the genome is represented at least r times when one more base is sequenced. To improve numerical stability, we use the mean number of new nucleotides with coverage at least r in one unit of sequencing effort to approximate the ratio. The amount of sequencing in one unit of sequencing effort is defined by the variable bin.
Note that the benefit-cost ratio is not monotonic. The ratio first increases and then decreases as the amount of sequencing increases. To predict the optimal amount of sequencing, we consider only the region after the peak, where the ratio starts to decrease.
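The selection rule described above can be sketched as follows. This is a minimal illustration, not the package's implementation: covered() is a hypothetical toy curve standing in for the estimated number of nucleotides covered at least r times, the ratio is taken as the gain per base over one bin (one reading of the approximation described above), and reporting the end of the last qualifying bin is a choice made for this sketch.

```r
# Hypothetical curve: nucleotides covered at least r times after sequencing
# t bases. A toy sigmoid replaces the real estimator (ASSUMPTION).
covered <- function(t) 3e9 / (1 + exp(-(t - 2e10) / 4e9))

bin <- 1e8          # one unit of sequencing effort
efficiency <- 0.05  # minimum benefit-cost ratio
t <- seq(0, 1e11, by = bin)

# Approximate the ratio by the gain in covered nucleotides per base
# sequenced within each bin; ratio[i] belongs to the bin t[i]..t[i + 1].
ratio <- diff(covered(t)) / bin

# Restrict the search to the region from the peak onwards.
peak <- which.max(ratio)
after_peak <- seq(peak, length(ratio))

# Bins after the peak whose ratio still exceeds the threshold.
ok <- after_peak[ratio[after_peak] > efficiency]

# Report the end of the last such bin, or NA if even the peak is too low.
optimal <- if (length(ok) > 0) t[max(ok) + 1] else NA_real_
optimal
```

Restricting the search to the post-peak region avoids stopping early in the initial stretch where the ratio is still rising and may sit below the threshold.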
Value
A numeric vector of length three. The first element is the optimal amount of sequencing. The second and third elements are the lower and upper bounds of the confidence interval.
Author(s)
Chao Deng
References
Deng, C., Daley, T., Calabrese, P., Ren, J., & Smith, A.D. (2016). Estimating the number of species to attain sufficient representation in a random sample. arXiv preprint arXiv:1607.02804v3.
Examples
## load library
# library(preseqR)
## import data
# data(SRR611492_5M)
## the optimal amount of sequencing with the benefit-cost ratio greater than
## 0.05 for r = 4
# preseqR.optimal.sequencing(n=SRR611492_5M, efficiency=0.05, bin=1e8, r=4)
## the optimal amount of sequencing with the benefit-cost ratio greater than
## 0.05 for r = 10
# preseqR.optimal.sequencing(n=SRR611492_5M, efficiency=0.05, bin=1e8, r=10)