## ML fitting of species abundance distributions

### Description

Fits probability distributions for abundances of species in a sample or assemblage by maximum likelihood.

### Usage

```fitsad(x, sad = c("bs","gamma","geom","lnorm","ls","mzsm","nbinom","pareto",
"poilog","power", "powbend", "volkov","weibull"), ...)

fitbs(x, trunc, ...)

fitgamma(x, trunc, start.value,  ...)

fitgeom(x, trunc = 0, start.value, ...)

fitlnorm(x, trunc, start.value,  ...)

fitls(x, trunc, start.value, upper = length(x), ...)

fitmzsm(x, trunc, start.value, upper = length(x), ...)

fitnbinom(x, trunc=0, start.value, ...)

fitpareto(x, trunc, start.value,  upper = 20, ...)

fitpoilog(x, trunc = 0, ...)

fitpower(x, trunc, start.value, upper = 20, ...)

fitpowbend(x, trunc, start.value, ...)

fitvolkov(x, trunc, start.value, ...)

fitweibull(x, trunc, start.value,  ...)
```

### Arguments

 `x` vector of (positive integer) quantiles. In the context of SADs, some abundance measurement (e.g., number of individuals, biomass) of species in a sample or ecological assemblage. `sad` character; root name of community sad distribution to be fitted. "gamma" for gamma distribution, "geom" for geometric distributions (not geometric series rad model, `dgs`), "lnorm" for lognormal, "ls" for Fisher's log-series, "mzsm" for Alonso & McKane's neutral metacommunity distribution, "nbinom" for negative binomial, "pareto" for Pareto distribution, "poilog" for Poisson-lognormal distribution, "power" for power-law distribution, "powbend" for Pueyo's power-bend distribution, "volkov" for Volkov's et al. neutral community distribution, "weibull" for Weibull distribution. `trunc` non-negative integer, `trunc > min(x)`; truncation point to fit a truncated distribution. For "poilog", "geom" and "nbinom" set trunc=NULL to avoid zero-truncation (see details). `start.value` named numeric vector; starting values of free parameters to be passed to `mle2`. Parameters should be named as in the corresponding density function, and in the same order. `upper` real positive; upper bound for Brent's one-parameter optimization method (default), for fits that use this method by default. See details and `optim`. `...` in `fitsad` further arguments to be passed to the specific fitting function (most used are `trunc`, `start.value`) In the specific fitting functions further arguments to be passed to `mle2`.

### Details

`fitsad` is simply a wrapper that calls the specific functions to fit the distribution chosen with the argument `sad`. Users can interchangeably use `fitsad` or the individual functions detailed below (e.g. `fitsad(x, sad="geom", ...)` is the same as `fitgeom(x, ...)` and so on).

The distributions are fitted by the maximum likelihood method using numerical optimization, with `mle2`. The resulting object is of `fitsad-class` which can be handled with `mle2` methods for fitted models and has also some additional methods for SADs models (see `fitsad-class` and examples).

Functions `fitgamma`, `fitlnorm` and `fitweibull` fit the standard continuous distributions most used as SADs models. As species with null abundances in the sample are in general unknown, the fit to continuous distributions can be improved by truncation to some value above zero (see example). Convergence problems can occur when fitting with truncation, and can be solved with sensible starting values. `fitgamma` uses Chapman's (1956) fitting method to find starting values for the truncated gamma distribution, and `fitweibull` uses Rinne's (2009, p. 413) method (thanks to Mario Jose Marques-Azevedo).

Functions `fitgeom`, `fitnbinom` fits geometric and negative binomial distributions which are two discrete standard distributions also used to fit SADs. Since species with zero individuals in the sample are in general unknown, these functions fit by default zero-truncated distributions. To avoid zero-truncation set `trunc=NULL`. Using the geometric distribution as a SAD model is not to be confounded for fitting the Geometric series `fitgs` as a rank-abundance distribution (RAD) model.

Function `fitls` implements the original numerical recipe by Fisher (1943) to fit the log-series distribution, given a vector of species abundances. Alonso et al. (2008, supplementary material) showed that this recipe gives the maximum likelihood estimate of Fisher's alpha, the single parameter of the log-series. Fitting is done through numerical optimization with the `uniroot` function, following the code of the function `fishers.alpha` of the untb package. After that, the estimated value of alpha parameter is used as the starting value to get the Log-likelihood from the log-series density function `dls`, using the function `mle2`. The total number of individuals in the sample, `N`, is treated as a fixed parameter in this implementation, in order to maintain coherence with similar parameters from `fitvolkov` and `fitmzsm` (see below). Fixed parameters in the model specification do not contribute to the model degrees of freedom, and are not accounted in standard error calculations.

Functions `fitpower`, `fitpowbend` and `fitpareto` fit power-law distributions with one and two-parameters, that have been suggested as SADs models. The implementations of power and power-bend are for discrete distributions that do not include zeroes. The Pareto distribution is continuous and includes all non-negative numbers. Fisher's logseries are a special case of the power-bend, see `dpowbend` and Pueyo (2006).

Function `fitbs` fits the Broken-stick distribution (MacArthur 1960). It is defined only by the observed number of elements `S` in the collection and collection size `N`. Thus once a sample is taken, the Broken-stick has no free parameters. Therefore, there is no actual fitting, but still `fitbs` calls `mle2` with fixed parameters `N` and `S` and `eval.only=TRUE` to return an object of class `fitsad` to keep compatibility with other SADs models fitted to the same data. Therefore, the resulting objects allows most of the operations with SAD models, such as comparison with other models through model selection, diagnostic plots and so on (see `fitsad-class`).

Function `fitpoilog` fits the Poisson-lognormal distribution. This is a compound distributions that describes the abundances of species in Poisson sample of community that follows a lognormal SAD. This is a sampling model of SAD, which is approximated by the ‘veil line’ truncation of the lognormal (Preston 1948). The Poisson-lognormal is the analytic solution for this sampling model, as Fisher's log-series is a analytic limit case for a Poisson-gamma (a.k.a negative binomial) distribution. As geometric and negative binomial distributions, the Poisson-lognormal includes zero, but the fit is zero-truncated by default, as for `fitgeom`, `fitnbinom`. To avoid zero-truncation set `trunc=NULL`.

`fitmzsm` fits the metacommunity Zero-sum multinomial distribution `dmzsm` from the Neutral Theory of Biodiversity (Alonso and McKane 2004). The mZSM describes the SAD of a sample taken from a neutral metacommunity under random drift. It has two parameters, the number of individuals in the sample `J` and `theta`, the ‘fundamental biodiversity number’. Because `J` is known from the sample size, the fit resumes to estimate a single parameter, `theta`. By default, `fitmzsm` fits mZSM to a vector of abundances with Brent's one-dimensional method of optimization (see `optim`). The log-series distribution (Fisher et al. 1943) is a limiting case of mZSM (Hubbel 2001), and `theta` tends to Fisher's alpha as `J` increases. In practice the two models provide very similar fits to SADs (see example).

Function `fitvolkov` fits the SAD model for a community under neutral drift with immigration (Volkov et al. 2003). The model is a stationary distribution deduced from a stochastic process compatible with the Neutral Theory of Biodiversity (Hubbell 2001). It has two free parameters, the ‘fundamental biodiversity number’ `theta`, and the immigration rate `m` (see `dvolkov`) `fitvolkov` builds on function `volkov` from package untb to fit Volkov's et al. SAD model to a vector of abundances. The fit can be extremely slow even for vectors of moderate size.

### Value

An object of `fitsad-class` which inherits from `mle2-class` and thus has methods for handling results of maximum likelihood fits from `mle2` and also specific methods to handle SADs models (see `fitsad-class`).

### Author(s)

Paulo I Prado prado@ib.usp.br and Murilo Dantas Miranda, after Ben Bolker, R Core Team, Robin Hanking, Vidar Grøtan and Steinar Engen.

### Source

all fitting functions builds on `mle2` and methods from bbmle package (Bolker 2012), which in turn builds on `mle` function and associated classes and methods; `fitls` and `fitvolkov` use codes and functions from untb package (Hankin 2007); `fitpoilog` builds on poilog package (Grøtan & Engen 2008).

### References

Alonso, D. and McKane, A.J. 2004. Sampling Hubbell's neutral model of biodiversity. Ecology Letters 7:901–910

Alonso, D. and Ostling, A., and Etienne, R.S. 2008 The implicit assumption of symmetry and the species abundance distribution. Ecology Letters, 11: 93-105.

Bolker, B. and R Development Core Team 2012. bbmle: Tools for general maximum likelihood estimation. R package version 1.0.5.2. http://CRAN.R-project.org/package=bbmle

Chapman, D. G. 1956. Estimating the parameters of a truncated gamma distribution. The Annals of Mathematical Statistics, 27(2): 498–506.

Fisher, R.A, Corbert, A.S. and Williams, C.B. (1943) The Relation between the number of species and the number of individuals in a random sample of an animal population. The Journal of Animal Ecology, 12(1): 42–58.

Grøtan, V. and Engen, S. 2008. poilog: Poisson lognormal and bivariate Poisson lognormal distribution. R package version 0.4.

Hankin, R.K.S. 2007. Introducing untb, an R Package For Simulating Ecological Drift Under the Unified Neutral Theory of Biodiversity. Journal of Statistical Software 22 (12).

Hubbell, S.P. 2001. The Unified Neutral Theory of Biodiversity. Princeton University Press

MacArthur, R.H. 1960. On the relative abundance of species. Am Nat 94:25–36.

Magurran, A.E. 1989. Ecological diversity and its measurement. Princenton University Press.

Preston, F.W. 1948. The commonness and rarity of species. Ecology 29: 254–283.

Pueyo, S. (2006) Diversity: Between neutrality and structure, Oikos 112: 392-405.

Rinne, H. 2009. The Weibull distribution: a handbook. CRC Press

Volkov, I., Banavar, J. R., Hubbell, S. P., Maritan, A. 2003. Neutral theory and relative species abundance in ecology. Nature 424:1035–1037.

`dls`, `dmzsm`, `dpareto`, `dpoilog`, `dpower`,` dvolkov`, `dpowbend` for corresponding density functions created for fitting SADs; standard distributions `dweibull`, `dgamma`, `dgeom`, `dlnorm`, `dnbinom`; `fitsad-class`.

### Examples

```
## Magurran (1989) example 5:
## birds in an Australian forest
mag5 <- c(103, 115, 13, 2, 67, 36, 51, 8, 6, 61, 10, 21,
7, 65, 4, 49, 92, 37, 16, 6, 23, 9, 2, 6, 5, 4,
1, 3, 1, 9, 2)
summary(mag5.bs)## no estimated coefficient
coef(mag5.bs) ## fixed coefficients N and S
## Diagnostic plots
par(mfrow=c(2, 2))
plot(mag5.bs)
par(mfrow=c(1, 1))

data(moths) #Fisher's moths data
## fit to log-series
coef(moths.ls)
coef(moths.mzsm) ## Compare with theta=38.9, Alonso&McKanne (2004)
## Diagnostic plots
par(mfrow=c(2, 2))
plot(moths.mzsm)
par(mfrow=c(1, 1))
## Graphical comparison
legend("topright", c("log-series", "mZSM"), lty=1, col=c("blue","red"))
## Two more models: truncated lognormal and Poisson-lognormal
## Model selection
AICtab(moths.ln, moths.pln, moths.ls, moths.mzsm, weights=TRUE)

## Biomass as abundance variable
data(ARN82.eB.apr77) #benthonic marine animals