preseqR.rSAC {preseqR}    R Documentation

Best practice for r-SAC – a fast version

Description

preseqR.rSAC predicts the expected number of species represented at least r times in a random sample, based on an initial sample.

Usage

preseqR.rSAC(n, r=1, mt=20, size=SIZE.INIT, mu=MU.INIT)

Arguments

n

A two-column matrix. The first column is the frequency j = 1, 2, ...; the second column is N_j, the number of species represented exactly j times in the initial sample. The first column must be sorted in ascending order.

mt

A positive integer constraining possible rational function approximations. Default is 20.

r

A positive integer, the minimum number of times a species must be represented in the sample to be counted. Default is 1.

size

A positive double, the initial value of the parameter size in the negative binomial distribution for the EM algorithm. Default value is 1.

mu

A positive double, the initial value of the parameter mu in the negative binomial distribution for the EM algorithm. Default value is 0.5.
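As a minimal sketch of the input format described for n, the two-column matrix can be built in base R from a vector of per-species counts. The counts vector below is made-up illustrative data, and the column names are only for readability; neither comes from the package.

```r
# Hypothetical data: how many times each of 10 species was observed
# in an initial sample (illustrative values, not from the package)
counts <- c(1, 1, 1, 2, 2, 3, 5, 5, 1, 2)

# Count how many species were observed exactly j times;
# table() orders the distinct values of j in ascending order
freq <- table(counts)

# Column 1: frequency j; column 2: N_j, the number of species seen j times
n <- cbind(j = as.integer(names(freq)), N_j = as.integer(freq))
print(n)
```

Because table() already returns the distinct frequencies in ascending order, the resulting matrix satisfies the sorting requirement on the first column.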

Details

preseqR.rSAC combines the nonparametric approach using the rational function approximation and the parametric approach using the zero-truncated negative binomial (ZTNB). For a given initial sample, if the sample is from a heterogeneous population, the function calls ds.rSAC; otherwise it calls ztnb.rSAC. The degree of heterogeneity is measured by the coefficient of variation, which is estimated by the ZTNB approach.
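The coefficient of variation mentioned above can be illustrated under one common reading of the negative binomial: a Poisson distribution whose rate follows a gamma distribution with shape parameter size. Under that assumption, the coefficient of variation of the rates is 1 / sqrt(size). The sketch below is not the package's code; the helper names gamma_cv and choose_approach and the cutoff of 1 are hypothetical and only illustrate the kind of decision described.

```r
# Coefficient of variation of gamma-distributed Poisson rates with shape `size`
# (assumes the gamma-Poisson interpretation of the negative binomial)
gamma_cv <- function(size) 1 / sqrt(size)

# Hypothetical dispatch rule: the cutoff is an illustrative assumption,
# not a value documented on this page
choose_approach <- function(size, cutoff = 1) {
  if (gamma_cv(size) >= cutoff) "ds.rSAC" else "ztnb.rSAC"
}

choose_approach(0.25)  # small shape, large CV: treated as heterogeneous here
choose_approach(4)     # large shape, small CV: treated as homogeneous here
```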

preseqR.rSAC is the fast version of preseqR.rSAC.bootstrap. It does not provide confidence intervals; to obtain them along with the estimates, use preseqR.rSAC.bootstrap.

Value

The estimator for the r-SAC, returned as a function. Its input is a vector of sampling efforts t, i.e., sample sizes relative to the initial sample. For example, t = 2 means a random sample that is twice the size of the initial sample.
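To make the meaning of t concrete, the sketch below converts sampling efforts into absolute sample sizes using base R. It assumes the size of the initial sample is the total number of observed individuals, sum(j * N_j); the matrix is made-up illustrative data, and the variable names are hypothetical.

```r
# Hypothetical histogram: column 1 is j, column 2 is N_j (illustrative values)
n <- cbind(c(1, 2, 3, 5), c(4, 3, 1, 2))

# Assumed initial sample size: total individuals observed, sum of j * N_j
initial_size <- sum(n[, 1] * n[, 2])

# Sampling efforts are expressed relative to the initial sample
t_values <- c(2, 10)
absolute_sizes <- t_values * initial_size
print(absolute_sizes)
```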

Author(s)

Chao Deng

References

Deng, C., Daley, T., Calabrese, P., Ren, J., & Smith, A.D. (2016). Estimating the number of species to attain sufficient representation in a random sample. arXiv preprint arXiv:1607.02804v3.

Examples

## load library
library(preseqR)

## import data
data(FisherButterfly)

## construct the estimator for SAC
estimator1 <- preseqR.rSAC(FisherButterfly, r=1)
## The number of species represented at least once in a sample,
## when the sample is 10 or 20 times the size of the initial sample
estimator1(c(10, 20))

## construct the estimator for r-SAC
estimator2 <- preseqR.rSAC(FisherButterfly, r=2)
## The number of species represented at least twice in a sample,
## when the sample is 50 or 100 times the size of the initial sample
estimator2(c(50, 100))

[Package preseqR version 4.0.0 Index]