zipf {VGAM} | R Documentation |
Zipf Distribution Family Function
Description
Estimates the parameter of the Zipf distribution.
Usage
zipf(N = NULL, lshape = "loglink", ishape = NULL)
Arguments
N |
Number of elements, an integer satisfying |
lshape |
Parameter link function applied to the (positive) shape parameter |
ishape |
Optional initial value for the parameter |
Details
The probability function for a response Y is
P(Y=y) = y^{-s} / \sum_{i=1}^N i^{-s},\ \ s>0,\ \ y=1,2,\ldots,N,
where s is the exponent characterizing the distribution.
The mean of Y, which is returned as the fitted value, is
\mu = H_{N,s-1} / H_{N,s}
where H_{n,m} = \sum_{i=1}^n i^{-m} is the nth generalized harmonic number of order m.
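The relationship between the probability function and the mean can be checked numerically. The following is a minimal base-R sketch, not part of VGAM: the helper names gen_harmonic, zipf_pmf and zipf_mean are hypothetical, and the values N = 5 and s = 2 are chosen only for illustration.

```r
# Hypothetical helpers for illustration; they are not VGAM functions.

# Generalized harmonic number H_{n,m} = sum_{i=1}^n i^(-m)
gen_harmonic <- function(n, m) sum(seq_len(n)^(-m))

# Zipf probability function P(Y = y) for y = 1, ..., N
zipf_pmf <- function(y, N, s) y^(-s) / gen_harmonic(N, s)

# Mean of Y expressed as H_{N,s-1} / H_{N,s}
zipf_mean <- function(N, s) gen_harmonic(N, s - 1) / gen_harmonic(N, s)

N <- 5; s <- 2                      # illustrative values only (assumption)
probs <- zipf_pmf(1:N, N, s)

total_prob  <- sum(probs)           # should equal 1
direct_mean <- sum((1:N) * probs)   # mean computed directly from the probabilities
ratio_mean  <- zipf_mean(N, s)      # mean computed from harmonic numbers

all.equal(total_prob, 1)
all.equal(direct_mean, ratio_mean)
```

Both comparisons should return TRUE, since multiplying y by y^{-s} and summing over y gives H_{N,s-1} in the numerator.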
Zipf's law is an empirical law that is often applied
to the study of the frequency of words in a corpus of
natural language utterances. It states that the frequency
of any word is inversely proportional to its rank in the
frequency table. For example, if "the" and "of" are the two
most common words, in that order, then Zipf's law states
that "the" is twice as common as "of".
Many other natural phenomena conform to Zipf's law.
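The rank-frequency statement can be tied back to the probability function given earlier in this section. As a short worked step, the normalizing constant cancels in the ratio of the probabilities of ranks 1 and 2, so the "twice as common" statement corresponds to the special case s = 1:

```latex
\frac{P(Y=1)}{P(Y=2)}
  = \frac{1^{-s} / H_{N,s}}{2^{-s} / H_{N,s}}
  = 2^{s},
\qquad \text{so } s = 1 \text{ gives } P(Y=1) = 2\,P(Y=2).
```

The ratio does not depend on N. For other values of s, frequency is inversely proportional to a power of the rank rather than to the rank itself.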
Value
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm and vgam.
Note
Upon convergence, the value of N is stored in @misc$N of the fitted object.
Author(s)
T. W. Yee
References
Johnson, N. L., Kemp, A. W. and Kotz, S. (2005). Univariate Discrete Distributions, 3rd edition, Hoboken, New Jersey, USA: Wiley. See pp.526– of Chapter 11.
See Also
Examples
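library(VGAM)  # provides vglm(), zipf() and Coef()

# Observed frequencies (ofreq) of the ranks y = 1, ..., 5
zdata <- data.frame(y = 1:5, ofreq = c(63, 14, 5, 1, 2))
# Default fit: log link for the shape parameter, frequencies as prior weights
zfit <- vglm(y ~ 1, zipf, data = zdata, trace = TRUE, weights = ofreq)
# Refit with an identity link and an explicit initial value for the shape
zfit <- vglm(y ~ 1, zipf(lshape = "identitylink", ishape = 3.4), data = zdata,
             trace = TRUE, weights = ofreq, crit = "coef")
zfit@misc$N                            # value of N used in the fit
(shape.hat <- Coef(zfit))              # estimated shape parameter
with(zdata, weighted.mean(y, ofreq))   # weighted sample mean of y
fitted(zfit, matrix = FALSE)           # fitted mean, for comparison with the sample mean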