spc {zipfR} | R Documentation |
Frequency Spectra (zipfR)
Description
In the zipfR library, spc objects are used to represent a word frequency spectrum (either an observed spectrum or the expected spectrum of an LNRE model at a given sample size).
With the spc constructor function, an object can be initialized directly from the specified data vectors. It is more common, though, to read an observed spectrum from a disk file with read.spc or to compute an expected spectrum with lnre.spc.
spc objects should always be treated as read-only.
Usage
spc(Vm, m=1:length(Vm), VVm=NULL, N=NA, V=NA, VV=NA,
m.max=0, expected=!missing(VVm))
Arguments
m |
integer vector of frequency classes |
Vm |
vector of corresponding class sizes |
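VVm |
optional vector of estimated variances |
N, V |
total sample size N and vocabulary size V |
VV |
variance of expected vocabulary size |
m.max |
highest frequency class listed in an incomplete spectrum (the default 0 indicates a complete spectrum) |
expected |
set to TRUE if the spectrum lists expected class sizes E[V_m] rather than observed sizes (by default, TRUE exactly when VVm is supplied) |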
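The two computed defaults in the Usage line, m=1:length(Vm) and expected=!missing(VVm), can be illustrated with a small stand-in function in base R. This is only a sketch: demo_defaults is a hypothetical name, it copies nothing from zipfR except the default expressions shown under Usage, and it does not build an spc object.

```r
# Hypothetical stand-in that reuses only the default expressions from the
# Usage line; it is NOT the zipfR implementation of spc().
demo_defaults <- function(Vm, m = 1:length(Vm), VVm = NULL,
                          expected = !missing(VVm)) {
  # Return the values the defaults resolve to for this call.
  list(m = m, expected = expected)
}

# Only class sizes given: m defaults to 1, 2, 3 and expected to FALSE.
observed_call <- demo_defaults(c(5, 2, 1))

# Variances supplied: expected switches to TRUE without being set explicitly.
expected_call <- demo_defaults(c(5, 2, 1), VVm = c(0.4, 0.2, 0.1))

print(observed_call)
print(expected_call)
```

Because the default for expected depends on whether VVm was passed at all, supplying variances is enough to mark the spectrum as expected.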
Details
An spc object is a data frame with the following variables:
m
frequency class m, an integer vector
Vm
class size, i.e. number V_m of types in frequency class m (either observed class size from a sample or expected class size E[V_m] based on an LNRE model)
VVm
optional: estimated variance V[V_m] of expected class size (only meaningful for an expected spectrum derived from an LNRE model)
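To make the meaning of m and Vm concrete, the following base R sketch derives both columns from a toy token vector. The token vector and the names type_freq, spectrum and spc_like are made up for illustration, and the result is a plain data frame that only mirrors the column layout described above; it is not a real spc object.

```r
# Toy sample of 7 tokens (hypothetical data).
tokens <- c("a", "b", "a", "c", "a", "b", "d")

# Frequency of each type: a = 3, b = 2, c = 1, d = 1.
type_freq <- table(tokens)

# Count how many types occur exactly m times.
spectrum <- table(as.vector(type_freq))

# Plain data frame with the same two columns as an observed spectrum.
spc_like <- data.frame(
  m  = as.integer(names(spectrum)),  # frequency class
  Vm = as.vector(spectrum)           # number of types in that class
)
print(spc_like)
```

Here two types occur once, one type occurs twice and one type occurs three times, so the class sizes for m = 1, 2, 3 are 2, 1 and 1.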
The following attributes are used to store additional information about the frequency spectrum:
m.max
if non-zero, the frequency spectrum is incomplete and lists only frequency classes up to m.max
N, V
sample size N and vocabulary size V of the frequency spectrum. For a complete frequency spectrum, these values could easily be determined from m and Vm, but they are essential for an incomplete spectrum.
VV
variance of expected vocabulary size; only present if hasVariances is TRUE. Note that VV may have the value NA if the user failed to specify it.
expected
if TRUE, the frequency spectrum lists expected class sizes E[V_m] (rather than observed sizes V_m). Note that the VVm variable is only allowed for an expected frequency spectrum.
hasVariances
indicates whether or not the VVm variable is present
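The remark that N and V can be determined from m and Vm for a complete spectrum, but not for an incomplete one, can be checked with a few lines of base R. The class sizes below are invented for illustration, and the variable names are not part of zipfR.

```r
# Hypothetical complete spectrum: m lists frequency classes, Vm their sizes.
m  <- 1:5
Vm <- c(6, 3, 1, 0, 1)

# For a complete spectrum, N and V follow directly from m and Vm.
N <- sum(m * Vm)  # total number of tokens
V <- sum(Vm)      # total number of types

# Keep only the classes an incomplete spectrum with m.max = 2 would list.
m.max <- 2
keep <- m <= m.max
N_partial <- sum(m[keep] * Vm[keep])
V_partial <- sum(Vm[keep])

# The partial sums miss the higher classes, which is why N and V have to be
# stored separately for an incomplete spectrum.
cat("complete spectrum: N =", N, " V =", V, "\n")
cat("first", m.max, "classes only: N =", N_partial, " V =", V_partial, "\n")
```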
Value
An object of class spc representing the specified frequency spectrum. This object should be treated as read-only (although such behaviour cannot be enforced in R).
See Also
read.spc, write.spc, spc.vector, sample.spc, spc2tfl, tfl2spc, lnre.spc, plot.spc
Generic methods supported by spc objects are print, summary, N, V, Vm, VV, and VVm.
Implementation details and non-standard arguments for these methods can be found on the manpages print.spc, summary.spc, N.spc, V.spc, etc.
Examples
## load Brown imaginative prose spectrum and inspect it
data(BrownImag.spc)
summary(BrownImag.spc)
print(BrownImag.spc)
plot(BrownImag.spc)
N(BrownImag.spc)
V(BrownImag.spc)
Vm(BrownImag.spc,1)
Vm(BrownImag.spc,1:5)
## compute ZM model, and generate PARTIAL expected spectrum
## with variances for a sample of 10 million tokens
zm <- lnre("zm",BrownImag.spc)
zm.spc <- lnre.spc(zm,1e+7,variances=TRUE)
## inspect extrapolated spectrum
summary(zm.spc)
print(zm.spc)
plot(zm.spc,log="x")
N(zm.spc)
V(zm.spc)
VV(zm.spc)
Vm(zm.spc,1)
VVm(zm.spc,1)
## generate an artificial Zipfian-looking spectrum
## and take a look at it
zipf.spc <- spc(round(1000/(1:1000)^2))
summary(zipf.spc)
plot(zipf.spc)
## see manpages of lnre, and the various *.spc manpages
## for more examples of spc usage
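The class-size vector passed to spc in the artificial Zipfian example can also be inspected with base R alone, without loading zipfR. The sketch below only looks at the raw vector; how spc itself treats the empty classes is not stated here, so no claim is made about that.

```r
# Class sizes used in the artificial example: Vm is roughly 1000 / m^2.
Vm <- round(1000 / (1:1000)^2)

# The first few classes dominate the spectrum.
print(head(Vm, 5))

# Rounding drives all higher classes to zero; find the last non-empty one.
nonzero <- which(Vm > 0)
last_class <- max(nonzero)
cat("non-empty classes:", length(nonzero), " last one: m =", last_class, "\n")
```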