lnre.productivity.measures {zipfR} | R Documentation
Measures of Productivity and Lexical Richness (zipfR)
Description
Compute expectations of various measures of productivity and lexical richness for an LNRE population.
Usage
lnre.productivity.measures(model, N=NULL, measures, data.frame=TRUE,
bootstrap=FALSE, method="normal", conf.level=.95, sample=NULL,
replicates=1000, parallel=1L, verbose=TRUE, seed=NULL)
Arguments
model |
an object belonging to a subclass of |
measures |
character vector naming the productivity measures to
be computed (see |
N |
an integer vector, specifying the sample size(s) |
data.frame |
if |
bootstrap |
if |
method , conf.level |
type of confidence interval to be estimated by parametric
bootstrapping and the requested confidence level;
see |
sample |
optional callback function to generate bootstrapping samples;
see |
replicates , parallel , seed , verbose |
if |
Details
If bootstrap=FALSE, expected values of the productivity measures are computed based on the following approximations:
- V, TTR, R and P are linear transformations of V or V_1, so expectations can be obtained directly from the EV and EVm methods.
- C, k, U and W are nonlinear transformations of V. In this case, the transformation function is approximated by a linear function around E[V], which is reasonable under typical circumstances.
- Hapax, S, alpha2 and H are based on ratios of two spectrum elements, in some cases with an additional nonlinear transformation. Expectations are based on normal approximations for V and V_i together with a generalisation of Díaz-Francés and Rubio's (2013: 313) result on the ratio of two independent normal distributions; for a nonlinear transformation the same linear approximation is made as above.
- K and D are (nearly) unbiased estimators of the population coefficient \delta = \sum_{i=1}^{\infty} \pi_i^2 (Simpson 1949: 688).
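The linear and linearised cases can be written out as a short derivation. This is a sketch of the reasoning, not a quotation from the technical report: it assumes the usual definitions TTR = V/N and P = V_1/N, and the symbol g (an arbitrary smooth transformation of V) is introduced here purely for illustration.

```latex
\begin{align*}
\mathrm{E}[\mathit{TTR}] &= \mathrm{E}[V] / N, \qquad
\mathrm{E}[P] = \mathrm{E}[V_1] / N
  && \text{(linear transformations: exact)} \\
g(V) &\approx g(\mathrm{E}[V]) + g'(\mathrm{E}[V]) \, (V - \mathrm{E}[V])
  && \text{(linearisation around } \mathrm{E}[V]\text{)} \\
\mathrm{E}[g(V)] &\approx g(\mathrm{E}[V])
  && \text{(because } \mathrm{E}[V - \mathrm{E}[V]] = 0\text{)}
\end{align*}
```

The approximation error is governed by the curvature of g and the variance of V, so it is small whenever V is concentrated around its expectation.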
Approximations used for expected values are explained in detail in Sec. 2.2 of the technical report Inside zipfR.
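The quality of the linear approximation for a nonlinear measure can be checked numerically. The following base R sketch does this for Herdan's C, assuming the usual definition C = log(V) / log(N) and a normal approximation for V. The helper herdan.C and all numeric values (N, EV, sdV) are hypothetical and chosen only for illustration; they are not derived from an LNRE model and the code does not call zipfR.

```r
## Illustrative check of the linear approximation E[g(V)] ~ g(E[V])
## for Herdan's C = log(V) / log(N).
## All numbers below are made up for illustration only.
set.seed(42)
N   <- 10000   # hypothetical sample size
EV  <- 1500    # hypothetical expected vocabulary size E[V]
sdV <- 25      # hypothetical standard deviation of V

## hypothetical helper implementing the assumed definition of C
herdan.C <- function(V, N) log(V) / log(N)

## simulate V under a normal approximation
V.sim <- rnorm(100000, mean = EV, sd = sdV)

C.linear <- herdan.C(EV, N)           # linearised expectation g(E[V])
C.mc     <- mean(herdan.C(V.sim, N))  # Monte Carlo estimate of E[g(V)]

print(c(linear = C.linear, monte.carlo = C.mc))
```

With V this tightly concentrated around its mean, the two values should agree closely; a larger standard deviation relative to E[V] would widen the gap.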
Value
If bootstrap=FALSE, a numeric matrix or data frame listing approximate expectations of the selected productivity measures, with one row for each sample size N and one column for each measure. Rows and columns are labelled.
If bootstrap=TRUE, a numeric matrix or data frame with one column for each productivity measure and four rows giving the lower and upper bound of the confidence interval, an estimate of central tendency, and an estimate of spread. See bootstrap.confint for details.
Productivity Measures
See productivity.measures for a list of supported measures with equations and references.
The measures Entropy and eta are only supported for bootstrap=TRUE.
References
Díaz-Francés, Eloísa and Rubio, Francisco J. (2013). On the existence of a normal approximation to the distribution of the ratio of two independent normal random variables. Statistical Papers, 54(2), 309–323.
Simpson, E. H. (1949). Measurement of diversity. Nature, 163, 688.
See Also
productivity.measures computes productivity measures from observed data sets.
See lnre for further information on LNRE models, and lnre.bootstrap and bootstrap.confint for details on the bootstrapping procedure.
Examples
## load the package that provides lnre() and lnre.productivity.measures()
library(zipfR)
## plausible model for an author's vocabulary
model <- lnre("fzm", alpha=0.4, B=0.06, A=1e-12)
## approximate expectation for different sample sizes
lnre.productivity.measures(model, N=c(1000, 10000, 50000))
## estimate sampling distribution: 95% interval, mean, s.d.
## (using parametric bootstrapping, only one sample size at a time)
lnre.productivity.measures(model, N=1000, bootstrap=TRUE)