R: ML fitting of species abundance distributions

fitsad {sads}

R Documentation

ML fitting of species abundance distributions

Description

Fits probability distributions for abundances of species in a sample or assemblage by maximum likelihood.

Usage

fitsad(x, sad = c("bs","exp", "gamma","geom","lnorm","ls","mzsm","nbinom","pareto",
       "poilog","power", "powbend", "volkov","weibull"), ...)

fitbs(x, trunc, ...)

fitexp(x, trunc = NULL, start.value, ...)

fitgamma(x, trunc, start.value,  ...)

fitgeom(x, trunc = 0, start.value, ...)

fitlnorm(x, trunc, start.value,  ...)

fitls(x, trunc, start.value, upper = length(x), ...)

fitmzsm(x, trunc, start.value, upper = length(x), ...)

fitnbinom(x, trunc = 0, start.value, ...)

fitpareto(x, trunc, start.value,  upper = 20, ...)

fitpoilog(x, trunc = 0, ...)

fitpower(x, trunc, start.value, upper = 20, ...)

fitpowbend(x, trunc, start.value, ...)

fitvolkov(x, trunc, start.value, ...)

fitweibull(x, trunc, start.value,  ...)

Arguments

`x`	vector of (positive integer) quantiles. In the context of SADs, some abundance measurement (e.g., number of individuals, biomass) of species in a sample or ecological assemblage.
`sad`	character; root name of community sad distribution to be fitted. `"bs"` for broken-stick distribution, `"exp"` for exponential distribution, `"gamma"` for gamma distribution, `"geom"` for geometric distributions (not geometric series rad model, `dgs`), `"lnorm"` for lognormal, `"ls"` for Fisher's log-series, `"mzsm"` for Alonso & McKane's neutral metacommunity distribution, `"nbinom"` for negative binomial, `"pareto"` for Pareto distribution, `"poilog"` for Poisson-lognormal distribution, `"power"` for power-law distribution, `"powbend"` for Pueyo's power-bend distribution, `"volkov"` for Volkov's et al. neutral community distribution, `"weibull"` for Weibull distribution.
`trunc`	non-negative integer, `trunc > min(x)`; truncation point to fit a truncated distribution. For "poilog", "geom" and "nbinom" set trunc=NULL to avoid zero-truncation (see details).
`start.value`	named numeric vector; starting values of free parameters to be passed to `mle2`. Parameters should be named as in the corresponding density function, and in the same order.
`upper`	real positive; upper bound for Brent's one-parameter optimization method (default), for fits that use this method by default. See details and `optim`.
`...`	in `fitsad` further arguments to be passed to the specific fitting function (most used are `trunc`, `start.value`) In the specific fitting functions further arguments to be passed to `mle2`.

Details

fitsad is simply a wrapper that calls the specific functions to fit the distribution chosen with the argument sad. Users can interchangeably use fitsad or the individual functions detailed below (e.g. fitsad(x, sad="geom", ...) is the same as fitgeom(x, ...) and so on).

The distributions are fitted by the maximum likelihood method using numerical optimization, with mle2. The resulting object is of fitsad-class which can be handled with mle2 methods for fitted models and has also some additional methods for SADs models (see fitsad-class and examples).

Functions fitexp, fitgamma, fitlnorm and fitweibull fit the standard continuous distributions most used as SADs models. As species with null abundances in the sample are in general unknown, the fit to continuous distributions can be improved by truncation to some value above zero (see example). Convergence problems can occur when fitting with truncation, and can be solved with sensible starting values. fitgamma uses Chapman's (1956) fitting method to find starting values for the truncated gamma distribution, and fitweibull uses Rinne's (2009, p. 413) method (thanks to Mario Jose Marques-Azevedo).

Functions fitgeom, fitnbinom fits geometric and negative binomial distributions which are two discrete standard distributions also used to fit SADs. Since species with zero individuals in the sample are in general unknown, these functions fit by default zero-truncated distributions. To avoid zero-truncation set trunc=NULL. Using the geometric distribution as a SAD model is not to be confounded for fitting the Geometric series fitgs as a rank-abundance distribution (RAD) model.

Function fitls implements the original numerical recipe by Fisher (1943) to fit the log-series distribution, given a vector of species abundances. Alonso et al. (2008, supplementary material) showed that this recipe gives the maximum likelihood estimate of Fisher's alpha, the single parameter of the log-series. Fitting is done through numerical optimization with the uniroot function, following the code of the function fishers.alpha of the untb package. After that, the estimated value of alpha parameter is used as the starting value to get the Log-likelihood from the log-series density function dls, using the function mle2. The total number of individuals in the sample, N, is treated as a fixed parameter in this implementation, in order to maintain coherence with similar parameters from fitvolkov and fitmzsm (see below). Fixed parameters in the model specification do not contribute to the model degrees of freedom, and are not accounted in standard error calculations.

Functions fitpower, fitpowbend and fitpareto fit power-law distributions with one and two-parameters, that have been suggested as SADs models. The implementations of power and power-bend are for discrete distributions that do not include zeroes. The Pareto distribution is continuous and includes all non-negative numbers. Fisher's logseries are a special case of the power-bend, see dpowbend and Pueyo (2006).

Function fitbs fits the Broken-stick distribution (MacArthur 1960). It is defined only by the observed number of elements S in the collection and collection size N. Thus once a sample is taken, the Broken-stick has no free parameters. Therefore, there is no actual fitting, but still fitbs calls mle2 with fixed parameters N and S and eval.only=TRUE to return an object of class fitsad to keep compatibility with other SADs models fitted to the same data. Therefore, the resulting objects allows most of the operations with SAD models, such as comparison with other models through model selection, diagnostic plots and so on (see fitsad-class).

Function fitpoilog fits the Poisson-lognormal distribution. This is a compound distributions that describes the abundances of species in Poisson sample of community that follows a lognormal SAD. This is a sampling model of SAD, which is approximated by the ‘veil line’ truncation of the lognormal (Preston 1948). The Poisson-lognormal is the analytic solution for this sampling model, as Fisher's log-series is a analytic limit case for a Poisson-gamma (a.k.a negative binomial) distribution. As geometric and negative binomial distributions, the Poisson-lognormal includes zero, but the fit is zero-truncated by default, as for fitgeom, fitnbinom. To avoid zero-truncation set trunc=NULL.

fitmzsm fits the metacommunity Zero-sum multinomial distribution dmzsm from the Neutral Theory of Biodiversity (Alonso and McKane 2004). The mZSM describes the SAD of a sample taken from a neutral metacommunity under random drift. It has two parameters, the number of individuals in the sample J and theta, the ‘fundamental biodiversity number’. Because J is known from the sample size, the fit resumes to estimate a single parameter, theta. By default, fitmzsm fits mZSM to a vector of abundances with Brent's one-dimensional method of optimization (see optim). The log-series distribution (Fisher et al. 1943) is a limiting case of mZSM (Hubbel 2001), and theta tends to Fisher's alpha as J increases. In practice the two models provide very similar fits to SADs (see example).

Function fitvolkov fits the SAD model for a community under neutral drift with immigration (Volkov et al. 2003). The model is a stationary distribution deduced from a stochastic process compatible with the Neutral Theory of Biodiversity (Hubbell 2001). It has two free parameters, the ‘fundamental biodiversity number’ theta, and the immigration rate m (see dvolkov) fitvolkov builds on function volkov from package untb to fit Volkov's et al. SAD model to a vector of abundances. The fit can be extremely slow even for vectors of moderate size.

Value

An object of fitsad-class which inherits from mle2-class and thus has methods for handling results of maximum likelihood fits from mle2 and also specific methods to handle SADs models (see fitsad-class).

Author(s)

Paulo I Prado prado@ib.usp.br, Murilo Dantas Miranda and Andre Chalom, after Ben Bolker, R Core Team, Robin Hanking, Vidar Grøtan and Steinar Engen.

Source

all fitting functions builds on mle2 and methods from bbmle package (Bolker 2012), which in turn builds on mle function and associated classes and methods; fitls and fitvolkov use codes and functions from untb package (Hankin 2007); fitpoilog builds on poilog package (Grøtan & Engen 2008).

References

Alonso, D. and McKane, A.J. 2004. Sampling Hubbell's neutral model of biodiversity. Ecology Letters 7:901–910

Alonso, D. and Ostling, A., and Etienne, R.S. 2008 The implicit assumption of symmetry and the species abundance distribution. Ecology Letters, 11: 93-105.

Bolker, B. and R Development Core Team 2012. bbmle: Tools for general maximum likelihood estimation. R package version 1.0.5.2. http://CRAN.R-project.org/package=bbmle

Chapman, D. G. 1956. Estimating the parameters of a truncated gamma distribution. The Annals of Mathematical Statistics, 27(2): 498–506.

Fisher, R.A, Corbert, A.S. and Williams, C.B. (1943) The Relation between the number of species and the number of individuals in a random sample of an animal population. The Journal of Animal Ecology, 12(1): 42–58.

Grøtan, V. and Engen, S. 2008. poilog: Poisson lognormal and bivariate Poisson lognormal distribution. R package version 0.4.

Hankin, R.K.S. 2007. Introducing untb, an R Package For Simulating Ecological Drift Under the Unified Neutral Theory of Biodiversity. Journal of Statistical Software 22 (12).

Hubbell, S.P. 2001. The Unified Neutral Theory of Biodiversity. Princeton University Press

MacArthur, R.H. 1960. On the relative abundance of species. Am Nat 94:25–36.

Magurran, A.E. 1989. Ecological diversity and its measurement. Princenton University Press.

Preston, F.W. 1948. The commonness and rarity of species. Ecology 29: 254–283.

Pueyo, S. (2006) Diversity: Between neutrality and structure, Oikos 112: 392-405.

Rinne, H. 2009. The Weibull distribution: a handbook. CRC Press

Volkov, I., Banavar, J. R., Hubbell, S. P., Maritan, A. 2003. Neutral theory and relative species abundance in ecology. Nature 424:1035–1037.

Examples


## Magurran (1989) example 5:
## birds in an Australian forest
mag5 <- c(103, 115, 13, 2, 67, 36, 51, 8, 6, 61, 10, 21,
          7, 65, 4, 49, 92, 37, 16, 6, 23, 9, 2, 6, 5, 4, 
          1, 3, 1, 9, 2)
mag5.bs <- fitsad(mag5, "bs") 
summary(mag5.bs)## no estimated coefficient
coef(mag5.bs) ## fixed coefficients N and S
## Diagnostic plots
par(mfrow=c(2, 2))
plot(mag5.bs)
par(mfrow=c(1, 1))


data(moths) #Fisher's moths data
moths.mzsm <- fitmzsm(moths) ## same as fitsad(moths, sad="mzsm")
## fit to log-series
moths.ls <- fitsad(moths, sad="ls")
coef(moths.ls)
coef(moths.mzsm) ## Compare with theta=38.9, Alonso&McKanne (2004)
## Diagnostic plots
par(mfrow=c(2, 2))
plot(moths.mzsm)
par(mfrow=c(1, 1))
## Graphical comparison
plot(rad(moths))
lines(radpred(moths.ls))
lines(radpred(moths.mzsm), col="red", lty=2)
legend("topright", c("log-series", "mZSM"), lty=1, col=c("blue","red"))
## Two more models: truncated lognormal and Poisson-lognormal
moths.ln <- fitsad(moths, "lnorm", trunc=0.5)
moths.pln <- fitsad(moths, "poilog")
## Model selection
AICtab(moths.ln, moths.pln, moths.ls, moths.mzsm, weights=TRUE)

## Biomass as abundance variable
data(ARN82.eB.apr77) #benthonic marine animals
AR.ln <- fitsad(ARN82.eB.apr77, sad="lnorm")
AR.g <- fitsad(ARN82.eB.apr77, sad="gamma")
AR.wb <- fitsad(ARN82.eB.apr77, sad="weibull")
plot(octav(ARN82.eB.apr77))
lines(octavpred(AR.ln))
lines(octavpred(AR.g), col="red")
lines(octavpred(AR.wb), col="green")
legend("topright", c("lognormal", "gamma", "weibull"),lty=1, col=c("blue","red", "green"))
AICctab(AR.ln, AR.g, AR.wb, nobs=length(ARN82.eB.apr77), weights=TRUE)

[Package sads version 0.6.3 Index]