R: Sampling from a species abundance distribution

rsad {sads}

R Documentation

Sampling from a species abundance distribution

Description

A given number of realizations of a probability distribution (species abundances in a community) is sampled with replacement by a Poisson or Negative Binomial process, or without replacement by a hypergeometric process.

Usage

rsad(S = NULL, frac, sad = c("bs","gamma","geom","lnorm","ls","mzsm","nbinom","pareto",
                             "poilog","power", "powbend", "volkov", "weibull"), 
                 coef, trunc=NaN, sampling=c("poisson", "nbinom", "hypergeometric"), 
                 k, zeroes=FALSE, ssize=1)

Arguments

`S`	positive integer; number of species in the community, which is the number of random deviates generated by the probability distribution given by argument `sad`. Notice that for some distributions the value of `S` is deduced from the coefficients, so this value is ignored (see Details).
`frac`	single numeric `0 < frac <= 1`; fraction of the community sampled. Usually the proportion of the total number of individuals, or of the total area or volume, that is sampled. Notice that the assumptions of Poisson and negative binomial sampling are only valid for small values of `frac` (see Details).
`sad`	numeric; a vector of positive real numbers depicting abundances of species in a community or sample OR character; root name of the community species abundance distribution, e.g. `lnorm` for the lognormal distribution `rlnorm`, `geom` for the geometric distribution `rgeom`, or `ls` for the log-series distribution `rls`.
`coef`	list with named arguments to be passed to the probability function defined by the argument `sad`.
`trunc`	The truncation point at which the random distribution defined in argument `sad` should be truncated; see `rtrunc`. The default value of `NaN` means that no truncation is performed.
`sampling`	character; if `poisson` the sampling process is Poisson (independent sampling of individuals with replacement); if `nbinom` negative binomial sampling is used to simulate aggregation of individuals in sampling units with replacement; finally, `hypergeometric` samples a fixed number of `frac` times the number of individuals in the simulated community, without replacement. Partial matching is allowed.
`k`	positive; size parameter for negative binomial sampling. This parameter is ignored for the other sampling schemes.
`zeroes`	logical; should zero values be included in the returned vector?
`ssize`	positive integer; sample size: number of draws taken from the community.

Details

This function simulates one or more random samples taken from a community with S species. The expected species abundances in the sampled community can (i) follow a probability distribution given by the argument sad or (ii) be a numeric vector provided by the user through this same argument. A fraction frac of the whole set of units that make up the community (usually individuals) is sampled. Hence the expected abundance of each species in the sample is frac*n, where n is the species' expected abundance in the community.

Three sampling processes can be simulated. Sampling with replacement can be done with Poisson sampling (individuals are sampled independently) or negative binomial sampling (individuals of each species are aggregated over sampling units). The "hypergeometric" sampling scheme draws a fixed number of individuals, frac times the total number of individuals in the simulated community, without replacement.

For the Poisson and negative binomial schemes the species abundances in the sample are statistically independent. In general terms, these two schemes take a Poisson or negative binomial sample, with replacement, of a vector of S realizations of a random variable, with the sampling intensity given by frac. The resulting values are realizations of a Poisson (or negative binomial) random variable whose mean (the expected value of the variable) follows a probability distribution, or is taken from the numeric vector, given by the argument sad. Because these two sampling schemes assume replacement but the sampled community is finite, they are valid only when the sampled fraction of the community is small (frac << 1).
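The mechanism described above can be sketched in base R without calling rsad. This is only an illustration under stated assumptions, not the package's implementation: the community abundances are drawn from a lognormal distribution with arbitrary parameters, and the object names (`n`, `pois_samp`, `nb_samp`) and the value of `k` are chosen for this example only.

```r
set.seed(42)
S <- 100                                 # number of species in the community
frac <- 0.1                              # fraction of the community sampled
n <- rlnorm(S, meanlog = 5, sdlog = 2)   # expected abundances in the community

# Poisson sampling: the count of each species has mean frac * n
pois_samp <- rpois(S, lambda = frac * n)

# Negative binomial sampling: same mean, aggregation controlled by size k
k <- 0.5
nb_samp <- rnbinom(S, size = k, mu = frac * n)

# Drop species absent from the sample, as with zeroes = FALSE
pois_samp <- pois_samp[pois_samp > 0]
nb_samp <- nb_samp[nb_samp > 0]
```

Each species is drawn independently of the others, so the total number of individuals in the sample varies between runs.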

The "hypergeometric" scheme simulates a sample of a fixed total number of individuals from the community. Therefore, the abundances of the species in the sample are interdependent (Connolly et al. 2009). Sampling is carried out with base::sample(..., replace = FALSE). This scheme samples a finite community without replacement and therefore provides valid results for any value of frac.
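Sampling a fixed number of individuals without replacement can be sketched in base R as follows. This is a hypothetical illustration rather than the code used by rsad: the four-species community, the rounding of the number of individuals drawn, and all object names are assumptions made for this example.

```r
set.seed(42)
n <- c(sp1 = 500, sp2 = 120, sp3 = 30, sp4 = 5)   # integer abundances in the community
frac <- 0.2                                       # fraction of the community sampled

# One label per individual in the community
individuals <- rep(names(n), times = n)

# Fixed number of individuals to draw (rounding is an assumption of this sketch)
n_drawn <- round(frac * sum(n))

# Draw the individuals without replacement
drawn <- sample(individuals, size = n_drawn, replace = FALSE)

# Abundance of each species in the sample, keeping species with zero counts
samp <- table(factor(drawn, levels = names(n)))
```

Because the counts in `samp` must add up to `n_drawn`, a higher count for one species leaves fewer individuals for the others, which is the interdependence mentioned above.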

For the broken-stick, logseries, MZSM and Volkov distributions, the expected value of S is deduced from the coefficients provided in the argument coef; thus, the value of the parameter S is ignored and may be left blank. The expressions for the number of species in each case are:

* Broken-stick: coefficient S
* Log-series: alpha log(1 + N/alpha)
* MZSM: sum_{x=1}^{J} theta/x (1 - x/J)^(theta - 1)
* Volkov: sum of the unnormalized PDF from 1 to J, see dvolkov
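The two closed-form expressions above (log-series and MZSM) can be evaluated directly in base R. The helper functions `ls_S` and `mzsm_S` are hypothetical names introduced for this sketch and are not part of the package; their argument names follow the symbols in the formulas, and the parameter values are arbitrary.

```r
# Log-series: expected S = alpha * log(1 + N / alpha)
ls_S <- function(N, alpha) {
  alpha * log(1 + N / alpha)
}

# MZSM: expected S = sum over x = 1..J of (theta / x) * (1 - x / J)^(theta - 1)
mzsm_S <- function(J, theta) {
  x <- seq_len(J)
  sum(theta / x * (1 - x / J)^(theta - 1))
}

ls_S(N = 1000, alpha = 10)
mzsm_S(J = 1000, theta = 10)
```

Note that the MZSM term at x = J is infinite when theta < 1, so this naive sum is only meant for theta >= 1.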

Value

If ssize=1, a vector of (zero truncated) abundances in the sample; if ssize>1, a data frame with sample identification, species identification, and (zero truncated) abundances.

Author(s)

Paulo I. Prado prado@ib.usp.br and Andre Chalom.

References

Pielou, E.C. 1977. Mathematical Ecology. New York: John Wiley and Sons.

Green, J. and Plotkin, J.B. 2007. A statistical theory for sampling species abundances. Ecology Letters, 10: 1037–1045.

Connolly, S.R., Dornelas, M., Bellwood, D.R. and Hughes, T.P. 2009. Testing species abundance models: a new bootstrap approach applied to Indo-Pacific coral reefs. Ecology, 90(11): 3138–3149.

Examples

## A Poisson sample from a community with a lognormal sad
samp2 <- rsad(S = 100, frac=0.1, sad="lnorm", coef=list(meanlog=5, sdlog=2))
## Preston plot
plot(octav(samp2))
## Since this is a Poisson sample of a lognormal community, the abundances
## in the sample should follow a Poisson-lognormal distribution.
## Compute the abundances predicted by the theoretical Poisson-lognormal
## truncated at zero, with mu = meanlog + log(frac) and sigma = sdlog
samp2.pred <- octavpred(samp2, sad="poilog", coef= list(mu=5+log(0.1), sig=2), trunc=0)
## Adding the line in the Preston plot
lines(samp2.pred)

[Package sads version 0.6.3 Index]