CoverageEstimator {cassandRa} | R Documentation |
Coverage Estimator, using Chao1 Index, Turing-Good or Binomial depending on what is possible
Description
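Estimates the sample coverage, choosing between a binomial model, the Turing-Good estimate and the Chao1 index according to the sample size and the numbers of singletons and doubletons.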
Usage
CoverageEstimator(x, cutoff = 5, BayesPrior = "Flat")
Arguments
x | A vector of integers, the observed sample counts |
cutoff | Sample size at or below which the binomial model is used instead of the Chao1 estimator |
BayesPrior | Prior to use in the binomial model. Either 'Flat' or 'Jeffereys'. |
Details
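Sample coverage is defined as the probability that the next interaction drawn is of a type that has already been seen. The coverage deficit, C_def, is the complementary probability that the next interaction drawn is of a type not yet seen.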
If the sample size is at or below the cutoff (default 5), or if all the samples are singletons, the coverage is calculated as the posterior mean of a binomial model with a flat prior (this can be changed to a Jeffreys prior with BayesPrior = 'Jeffereys').
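The binomial case can be sketched as follows. This is a minimal illustration, not the package's code: the helper name binomial_coverage is hypothetical, and it assumes that each of the n observations counts as a "success" unless it is a singleton, so that there are k = n - f1 successes in n trials. This page does not state how the binomial model is parameterised, so treat that choice as an assumption.

```r
# Hypothetical helper, not part of cassandRa.
# Assumption: k = n - f1 observations are of "already seen" types.
binomial_coverage <- function(x, prior = "Flat") {
  x  <- x[x > 0]          # assumption: zero counts are ignored
  n  <- sum(x)            # sample size (total observations)
  f1 <- sum(x == 1)       # number of singletons
  k  <- n - f1            # assumed number of successes

  # Beta(a, a) prior: a = 1 is flat, a = 0.5 is Jeffreys.
  a <- switch(prior,
              Flat = 1,
              Jeffereys = 0.5,
              stop("unknown prior: ", prior))

  # Posterior is Beta(k + a, n - k + a); return its mean.
  (k + a) / (n + 2 * a)
}

binomial_coverage(c(1, 1, 1))                       # all singletons, flat prior
binomial_coverage(c(1, 1, 1), prior = "Jeffereys")  # same data, Jeffreys prior
```

Under this assumption a sample made up only of singletons still yields a small positive coverage rather than zero, which is one reason a posterior mean is a natural choice for very small samples.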
If there are singletons but no doubletons, the Turing-Good estimate is used, where f1 is the number of singletons and n is the sample size:
c_hat = 1 - (f1/n)
If there are both singletons and doubletons, the Chao1 index is used, where f2 is the number of doubletons:
c_hat = 1 - ( (f1/n) * ( (f1*(n-1)) / ((n-1)*(f1+(2*f2))) ) )
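The two formulas above can be sketched in R as follows. This is an illustration under stated assumptions rather than the package's implementation: the helper name coverage_above_cutoff is hypothetical, n is taken to be the sum of the counts, and the cutoff check is left out. The Chao1 expression is transcribed exactly as written in this section; note that, as written, the (n-1) factors cancel, leaving f1/(f1 + 2*f2) as the correction term.

```r
# Hypothetical helper, not part of cassandRa.
# Applies the Turing-Good and Chao1 formulas exactly as written above.
coverage_above_cutoff <- function(x) {
  x  <- x[x > 0]        # assumption: zero counts are ignored
  n  <- sum(x)          # assumption: sample size is the sum of the counts
  f1 <- sum(x == 1)     # number of singletons
  f2 <- sum(x == 2)     # number of doubletons

  # Empty samples, samples without singletons, and all-singleton samples
  # are not covered by these two formulas.
  if (n == 0 || f1 == 0 || f1 == n) {
    return(NA_real_)
  }

  if (f2 == 0) {
    # Singletons but no doubletons: Turing-Good estimate.
    return(1 - f1 / n)
  }

  # Both singletons and doubletons: Chao1 form as documented.
  1 - (f1 / n) * ((f1 * (n - 1)) / ((n - 1) * (f1 + 2 * f2)))
}

coverage_above_cutoff(c(4, 3, 1, 1))     # no doubletons: 1 - 2/9
coverage_above_cutoff(c(5, 2, 2, 1, 1))  # singletons and doubletons: 1 - 2/33
```

Returning NA for the uncovered cases keeps the sketch from silently applying a formula outside the conditions this section describes.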
Value
c_hat, the estimated coverage (i.e. 1 - C_def, where C_def is the coverage deficit).