fitsad {sads}  R Documentation 
Fits probability distributions for abundances of species in a sample or assemblage by maximum likelihood.
fitsad(x, sad = c("bs","gamma","geom","lnorm","ls","mzsm","nbinom","pareto", "poilog","power", "powbend", "volkov","weibull"), ...) fitbs(x, trunc, ...) fitgamma(x, trunc, start.value, ...) fitgeom(x, trunc = 0, start.value, ...) fitlnorm(x, trunc, start.value, ...) fitls(x, trunc, start.value, upper = length(x), ...) fitmzsm(x, trunc, start.value, upper = length(x), ...) fitnbinom(x, trunc=0, start.value, ...) fitpareto(x, trunc, start.value, upper = 20, ...) fitpoilog(x, trunc = 0, ...) fitpower(x, trunc, start.value, upper = 20, ...) fitpowbend(x, trunc, start.value, ...) fitvolkov(x, trunc, start.value, ...) fitweibull(x, trunc, start.value, ...)
x 
vector of (positive integer) quantiles. In the context of SADs, some abundance measurement (e.g., number of individuals, biomass) of species in a sample or ecological assemblage. 
sad 
character; root name of community sad distribution to be fitted.
"gamma" for gamma distribution,
"geom" for geometric distributions (not geometric series rad model,

trunc 
nonnegative integer, 
start.value 
named numeric vector; starting values of free parameters to be
passed to 
upper 
real positive; upper bound for Brent's oneparameter optimization
method (default), for fits that use this method by default. See
details and 
... 
in 
fitsad
is simply a wrapper that calls the specific functions to fit
the distribution chosen with the argument sad
. Users
can interchangeably use fitsad
or the individual functions
detailed below
(e.g. fitsad(x, sad="geom", ...)
is the same as
fitgeom(x, ...)
and so on).
The distributions are fitted by the
maximum likelihood method using numerical optimization,
with mle2
.
The resulting object is of fitsadclass
which can be handled with mle2
methods
for fitted models and has also some additional
methods for SADs models (see
fitsadclass
and examples).
Functions fitgamma
, fitlnorm
and fitweibull
fit the
standard continuous distributions most used as SADs models.
As species with null abundances in the sample are in general unknown, the
fit to continuous distributions can be improved by truncation to some value
above zero (see example). Convergence problems can occur when fitting
with truncation, and can be solved with sensible starting values.
fitgamma
uses Chapman's (1956) fitting method to find starting
values for the truncated gamma distribution, and fitweibull
uses Rinne's (2009, p. 413) method (thanks to Mario Jose MarquesAzevedo).
Functions fitgeom
, fitnbinom
fits geometric and negative
binomial distributions which are two discrete
standard distributions also used to fit SADs. Since species
with zero individuals in the sample are in general unknown,
these functions fit by default zerotruncated distributions.
To avoid zerotruncation set trunc=NULL
.
Using the geometric distribution as a SAD model is not to be
confounded for fitting the Geometric series fitgs
as a rankabundance distribution (RAD) model.
Function fitls
implements the original numerical recipe by Fisher (1943) to
fit the logseries distribution, given a vector of species abundances.
Alonso et al. (2008,
supplementary material) showed that this recipe gives the maximum
likelihood estimate of Fisher's alpha, the single parameter of the logseries.
Fitting is done through numerical optimization with the uniroot
function, following the code of the function fishers.alpha
of the
untb package. After that, the estimated value of alpha parameter is
used as the starting value to get the Loglikelihood from the logseries density
function dls
, using the function mle2
.
The total number of individuals in the sample, N
, is treated as a fixed
parameter in this implementation, in order to maintain coherence with
similar parameters from fitvolkov
and fitmzsm
(see below).
Fixed parameters in the model specification do not contribute to the model
degrees of freedom, and are not accounted in standard error calculations.
Functions fitpower
, fitpowbend
and fitpareto
fit powerlaw distributions
with one and twoparameters, that have been suggested as SADs models.
The implementations of power and powerbend are for discrete
distributions that do not include zeroes. The Pareto distribution is
continuous and includes all nonnegative numbers.
Fisher's logseries are a special case of the
powerbend, see dpowbend
and Pueyo (2006).
Function fitbs
fits the Brokenstick distribution
(MacArthur 1960). It is defined only by the observed number of
elements S
in the collection and collection size N
.
Thus once a sample is taken,
the Brokenstick has no free parameters.
Therefore, there is no actual fitting, but still
fitbs
calls
mle2
with
fixed parameters N
and S
and eval.only=TRUE
to return an object of class fitsad
to keep compatibility with other
SADs models fitted to the same data.
Therefore, the resulting objects allows most of the
operations with SAD models, such as
comparison with other models through model selection,
diagnostic plots and so on
(see fitsadclass
).
Function fitpoilog
fits the Poissonlognormal distribution.
This is a compound distributions that describes the abundances
of species in Poisson sample of community that follows a
lognormal SAD. This is a sampling model of SAD, which is
approximated by the ‘veil line’ truncation of the lognormal
(Preston 1948). The Poissonlognormal is the analytic
solution for this sampling model, as Fisher's logseries
is a analytic limit case for a Poissongamma
(a.k.a negative binomial) distribution. As geometric and
negative binomial distributions, the Poissonlognormal
includes zero, but the fit is zerotruncated
by default, as for fitgeom
, fitnbinom
.
To avoid zerotruncation set trunc=NULL
.
fitmzsm
fits the metacommunity Zerosum multinomial distribution
dmzsm
from the Neutral Theory of Biodiversity
(Alonso and McKane 2004). The mZSM describes the SAD of a sample
taken from a neutral metacommunity under random drift.
It has two parameters,
the number of individuals in the sample J
and theta
,
the ‘fundamental biodiversity number’.
Because J
is known from the sample size,
the fit resumes to estimate a single
parameter, theta
. By default, fitmzsm
fits mZSM to a vector of abundances
with Brent's onedimensional method of optimization (see
optim
). The logseries distribution (Fisher et al. 1943)
is a limiting case of mZSM (Hubbel 2001), and theta
tends to
Fisher's alpha as J
increases. In practice
the two models provide very similar fits to SADs (see example).
Function fitvolkov
fits the SAD model for a community
under neutral drift with immigration
(Volkov et al. 2003).
The model is a stationary distribution deduced from
a stochastic process compatible with the Neutral Theory
of Biodiversity (Hubbell 2001). It
has two
free parameters, the ‘fundamental biodiversity number’ theta
, and the
immigration rate m
(see dvolkov
)
fitvolkov
builds on function volkov
from package
untb to
fit Volkov's et al. SAD model to a vector of abundances.
The fit can be extremely slow even for vectors
of moderate size.
An object of fitsadclass
which inherits from mle2class
and thus has methods for handling
results of maximum likelihood fits from mle2
and also specific methods to handle SADs models
(see fitsadclass
).
Paulo I Prado prado@ib.usp.br and Murilo Dantas Miranda, after Ben Bolker, R Core Team, Robin Hanking, Vidar Grøtan and Steinar Engen.
all fitting functions builds on mle2
and methods
from bbmle package (Bolker 2012), which in turn builds on
mle
function and associated classes and methods;
fitls
and fitvolkov
use codes and functions from
untb package (Hankin 2007); fitpoilog
builds on
poilog package (Grøtan & Engen 2008).
Alonso, D. and McKane, A.J. 2004. Sampling Hubbell's neutral model of biodiversity. Ecology Letters 7:901–910
Alonso, D. and Ostling, A., and Etienne, R.S. 2008 The implicit assumption of symmetry and the species abundance distribution. Ecology Letters, 11: 93105.
Bolker, B. and R Development Core Team 2012. bbmle: Tools for general maximum likelihood estimation. R package version 1.0.5.2. http://CRAN.Rproject.org/package=bbmle
Chapman, D. G. 1956. Estimating the parameters of a truncated gamma distribution. The Annals of Mathematical Statistics, 27(2): 498–506.
Fisher, R.A, Corbert, A.S. and Williams, C.B. (1943) The Relation between the number of species and the number of individuals in a random sample of an animal population. The Journal of Animal Ecology, 12(1): 42–58.
Grøtan, V. and Engen, S. 2008. poilog: Poisson lognormal and bivariate Poisson lognormal distribution. R package version 0.4.
Hankin, R.K.S. 2007. Introducing untb, an R Package For Simulating Ecological Drift Under the Unified Neutral Theory of Biodiversity. Journal of Statistical Software 22 (12).
Hubbell, S.P. 2001. The Unified Neutral Theory of Biodiversity. Princeton University Press
MacArthur, R.H. 1960. On the relative abundance of species. Am Nat 94:25–36.
Magurran, A.E. 1989. Ecological diversity and its measurement. Princenton University Press.
Preston, F.W. 1948. The commonness and rarity of species. Ecology 29: 254–283.
Pueyo, S. (2006) Diversity: Between neutrality and structure, Oikos 112: 392405.
Rinne, H. 2009. The Weibull distribution: a handbook. CRC Press
Volkov, I., Banavar, J. R., Hubbell, S. P., Maritan, A. 2003. Neutral theory and relative species abundance in ecology. Nature 424:1035–1037.
dls
, dmzsm
, dpareto
,
dpoilog
, dpower
, dvolkov
, dpowbend
for corresponding density functions created for fitting SADs;
standard distributions dweibull
,
dgamma
, dgeom
, dlnorm
, dnbinom
;
fitsadclass
.
## Magurran (1989) example 5: ## birds in an Australian forest mag5 < c(103, 115, 13, 2, 67, 36, 51, 8, 6, 61, 10, 21, 7, 65, 4, 49, 92, 37, 16, 6, 23, 9, 2, 6, 5, 4, 1, 3, 1, 9, 2) mag5.bs < fitsad(mag5, "bs") summary(mag5.bs)## no estimated coefficient coef(mag5.bs) ## fixed coefficients N and S ## Diagnostic plots par(mfrow=c(2, 2)) plot(mag5.bs) par(mfrow=c(1, 1)) data(moths) #Fisher's moths data moths.mzsm < fitmzsm(moths) ## same as fitsad(moths, sad="mzsm") ## fit to logseries moths.ls < fitsad(moths, sad="ls") coef(moths.ls) coef(moths.mzsm) ## Compare with theta=38.9, Alonso&McKanne (2004) ## Diagnostic plots par(mfrow=c(2, 2)) plot(moths.mzsm) par(mfrow=c(1, 1)) ## Graphical comparison plot(rad(moths)) lines(radpred(moths.ls)) lines(radpred(moths.mzsm), col="red", lty=2) legend("topright", c("logseries", "mZSM"), lty=1, col=c("blue","red")) ## Two more models: truncated lognormal and Poissonlognormal moths.ln < fitsad(moths, "lnorm", trunc=0.5) moths.pln < fitsad(moths, "poilog") ## Model selection AICtab(moths.ln, moths.pln, moths.ls, moths.mzsm, weights=TRUE) ## Biomass as abundance variable data(ARN82.eB.apr77) #benthonic marine animals AR.ln < fitsad(ARN82.eB.apr77, sad="lnorm") AR.g < fitsad(ARN82.eB.apr77, sad="gamma") AR.wb < fitsad(ARN82.eB.apr77, sad="weibull") plot(octav(ARN82.eB.apr77)) lines(octavpred(AR.ln)) lines(octavpred(AR.g), col="red") lines(octavpred(AR.wb), col="green") legend("topright", c("lognormal", "gamma", "weibull"),lty=1, col=c("blue","red", "green")) AICctab(AR.ln, AR.g, AR.wb, nobs=length(ARN82.eB.apr77), weights=TRUE)