lnre.fzm {zipfR} | R Documentation |
The finite Zipf-Mandelbrot (fZM) LNRE Model (zipfR)
Description
The finite Zipf-Mandelbrot (fZM) LNRE model of Evert (2004).
The constructor function lnre.fzm is not user-visible. It is invoked implicitly when lnre is called with LNRE model type "fzm".
Usage
lnre.fzm(alpha=.8, A=1e-9, B=.01, param=list())
## user call: lnre("fzm", spc=spc) or lnre("fzm", alpha=.8, A=1e-9, B=.01)
Arguments
alpha | the shape parameter |
A | the lower cutoff parameter |
B | the upper cutoff parameter |
param | a list of parameters given as name-value pairs (alternative method of parameter specification) |
Details
The parameters of the fZM model can either be specified as immediate arguments:
lnre.fzm(alpha=.5, A=5e-12, B=.1)
or as a list of name-value pairs:
lnre.fzm(param=list(alpha=.5, A=5e-12, B=.1))
which is usually more convenient when the constructor is invoked by another function (such as lnre). If both immediate arguments and the param list are given, the immediate arguments override conflicting values in param. For any parameters that are neither specified as immediate arguments nor listed in param, the defaults from the function prototype are inserted.
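The precedence rules described above can be illustrated with a minimal base R sketch. The helper resolve_fzm_params is hypothetical and is not part of zipfR; it only mimics the documented behaviour (defaults, then the param list, then immediate arguments), and the actual constructor may be implemented differently.

```r
# Hypothetical helper, NOT the zipfR source code: it only illustrates the
# documented precedence  defaults < param list < immediate arguments.
resolve_fzm_params <- function(alpha, A, B, param = list()) {
  # defaults taken from the function prototype shown under Usage
  defaults <- list(alpha = 0.8, A = 1e-9, B = 0.01)

  # collect only those immediate arguments that were actually supplied
  immediate <- list()
  if (!missing(alpha)) immediate$alpha <- alpha
  if (!missing(A)) immediate$A <- A
  if (!missing(B)) immediate$B <- B

  # later lists override earlier ones, entry by entry
  modifyList(modifyList(defaults, param), immediate)
}

# alpha is given both ways, B only in param, A not at all
p <- resolve_fzm_params(alpha = 0.5, param = list(alpha = 0.6, B = 0.1))
str(p)
```

In this call the immediate alpha overrides the value in param, B is taken from param, and A falls back to the prototype default.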
The lnre.fzm constructor also checks the types and ranges of parameter values and aborts with an error message if an invalid parameter is detected.
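As a rough illustration of such a check, the following base R sketch validates the three parameters against the ranges stated under Mathematical Details (0 < alpha < 1 and 0 < A < B <= 1). The function check_fzm_params and its error messages are assumptions made for this example; the checks and messages used by zipfR itself may differ.

```r
# Hypothetical validator, NOT the zipfR implementation. The ranges are those
# stated under Mathematical Details: 0 < alpha < 1 and 0 < A < B <= 1.
check_fzm_params <- function(alpha, A, B) {
  # type check: each parameter must be a single number
  stopifnot(is.numeric(alpha), length(alpha) == 1,
            is.numeric(A), length(A) == 1,
            is.numeric(B), length(B) == 1)
  # range checks
  if (alpha <= 0 || alpha >= 1) stop("alpha must satisfy 0 < alpha < 1")
  if (A <= 0 || A >= B) stop("A must satisfy 0 < A < B")
  if (B > 1) stop("B must satisfy B <= 1")
  invisible(TRUE)
}

# valid parameters pass silently
ok <- check_fzm_params(alpha = 0.5, A = 5e-12, B = 0.1)

# an invalid shape parameter aborts with an error, captured here as a string
bad <- tryCatch(check_fzm_params(alpha = 1.2, A = 5e-12, B = 0.1),
                error = function(e) conditionMessage(e))
print(bad)
```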
NB: parameter estimation is faster and more robust for the inexact fZM model, so you might consider passing the exact=FALSE option to lnre unless you intend to make predictions for small sample sizes N and/or high spectrum elements E[V_m(N)] (m \gg 1) with the model.
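A possible way to request the inexact model is sketched below. It assumes that the zipfR package is installed and uses the ItaRi.spc frequency spectrum bundled with it; the whole block is skipped if the package is unavailable. The variable names are chosen for this example only.

```r
# Sketch only: needs the zipfR package; nothing is fitted if it is missing.
fzm_inexact <- NULL
if (requireNamespace("zipfR", quietly = TRUE)) {
  library(zipfR)
  data(ItaRi.spc)  # example frequency spectrum shipped with zipfR

  # estimate the fZM parameters using the inexact (faster) model
  fzm_inexact <- lnre("fzm", spc = ItaRi.spc, exact = FALSE)

  # expected vocabulary size and expected number of hapaxes (m = 1)
  # at the observed sample size
  N_obs <- N(ItaRi.spc)
  print(EV(fzm_inexact, N_obs))
  print(EVm(fzm_inexact, 1, N_obs))
}
```

For predictions at small N or for spectrum elements with large m, the same call without exact=FALSE would use the exact model, as recommended in the note above.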
Value
A partially initialized object of class lnre.fzm, which is completed and passed back to the user by the lnre function. See lnre for a detailed description of lnre.fzm objects (as a subclass of lnre).
Mathematical Details
Similar to ZM, the fZM model is an LNRE re-formulation of the Zipf-Mandelbrot law for a population with a finite vocabulary size S, i.e.
\pi_k = \frac{C}{(k + b) ^ a}
for k = 1, \ldots, S. The parameters of the Zipf-Mandelbrot law are a > 1, b \ge 0 and S (see also Baayen 2001, 101ff). The fZM model is given by the type density function
g(\pi) := C\cdot \pi^{-\alpha-1}
for A \le \pi \le B (and g(\pi) = 0 otherwise), and has three parameters, 0 < \alpha < 1 and 0 < A < B \le 1. The normalizing constant is
C = \frac{ 1 - \alpha }{ B^{1 - \alpha} - A^{1 - \alpha} }
and the population vocabulary size is
S = \frac{1 - \alpha}{\alpha} \cdot \frac{ A^{-\alpha} - B^{-\alpha} }{ B^{1 - \alpha} - A^{1 - \alpha} }
See Evert (2004) and the lnre.zm manpage for further details.
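The two closed-form expressions can be checked numerically. The base R sketch below assumes the usual LNRE conditions that the type density satisfies \int \pi g(\pi) d\pi = 1 and that S = \int g(\pi) d\pi, both taken over [A, B]. The function fzm_constants and the parameter values are chosen for illustration only and are not part of zipfR.

```r
# Hypothetical helper: closed-form C and S as given in the formulas above.
fzm_constants <- function(alpha, A, B) {
  denom <- B^(1 - alpha) - A^(1 - alpha)
  C <- (1 - alpha) / denom
  S <- (1 - alpha) / alpha * (A^(-alpha) - B^(-alpha)) / denom
  list(C = C, S = S)
}

# illustrative parameter values (not defaults of any zipfR function)
alpha <- 0.5
A <- 1e-6
B <- 0.1
k <- fzm_constants(alpha, A, B)

# Numerical check with the substitution u = log(pi), which keeps the
# integrands smooth even when A is very small:
#   pi * g(pi) dpi = C * exp((1 - alpha) * u) du
#        g(pi) dpi = C * exp(-alpha * u) du
total_prob <- integrate(function(u) k$C * exp((1 - alpha) * u),
                        lower = log(A), upper = log(B))$value
n_types <- integrate(function(u) k$C * exp(-alpha * u),
                     lower = log(A), upper = log(B))$value

# total_prob should be close to 1 and n_types close to k$S
print(c(total_prob = total_prob, n_types = n_types, S = k$S))
```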
References
Baayen, R. Harald (2001). Word Frequency Distributions. Kluwer, Dordrecht.
Evert, Stefan (2004). A simple LNRE model for random character sequences. Proceedings of JADT 2004, 411-422.
See Also
lnre for pointers to relevant methods and functions for objects of class lnre, as well as a complete listing of LNRE models implemented in the zipfR library.