| read.vgc {zipfR} | R Documentation |
Loading and Saving Vocabulary Growth Curves (zipfR)
Description
read.vgc loads vocabulary growth data from a .vgc file.
write.vgc saves vocabulary growth data to a .vgc file.
Usage
read.vgc(file)
write.vgc(vgc, file)
Arguments
file |
character string specifying the pathname of a disk file.
Files with extension |
vgc |
a vocabulary growth curve, i.e. an object of class
|
Format
A TAB-delimited text file with column headers but no row names
(suitable for reading with read.delim). The file must contain
at least the following two columns:
N | increasing integer vector of sample sizes N
V | corresponding observed vocabulary sizes V(N) or expected vocabulary sizes E[V(N)]
Optionally, columns V1, ..., V9 can be added to
specify the number of hapaxes (V_1(N)), dis legomena
(V_2(N)), and further spectrum elements up to V_9(N).
It is not necessary to include all 9 columns, but for any V_m(N)
in the data set, all "lower" spectrum elements V_{m'}(N) (for
m' < m) must also be present. For example, it is valid to have
columns V1 V2 V3, but not V1 V3 V5 or V2 V3 V4.
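The rule for spectrum columns can be illustrated with a small check written in base R. This is only a sketch: has_valid_spectrum_columns is a hypothetical helper invented for this example and is not part of zipfR, and it looks only at the column names V1 to V9, not at the variance columns.

```r
# Hypothetical helper (not part of zipfR): checks that the Vm columns
# form an unbroken run V1, V2, ..., Vk with no gaps.
has_valid_spectrum_columns <- function(col_names) {
  # keep only names of the form V1 ... V9
  vm <- grep("^V[1-9]$", col_names, value = TRUE)
  # extract the frequency classes m as sorted integers
  m <- sort(as.integer(sub("^V", "", vm)))
  # valid if there are no Vm columns, or the classes are exactly 1..k
  length(m) == 0 || identical(m, seq_along(m))
}

has_valid_spectrum_columns(c("N", "V", "V1", "V2", "V3"))  # TRUE
has_valid_spectrum_columns(c("N", "V", "V1", "V3", "V5"))  # FALSE
has_valid_spectrum_columns(c("N", "V", "V2", "V3", "V4"))  # FALSE
```

The three calls mirror the valid and invalid column sets named in the text above.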
Variances for expected vocabulary sizes and spectrum elements can be
given in further columns VV (for Var[V(N)]) and VV1, ..., VV9 (for
Var[V_m(N)]). VV is mandatory in this case, and columns VVm must be
specified for exactly the same frequency classes m as the Vm columns
above.
These columns may appear in any order in the text file. All other columns will be silently ignored.
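As a rough illustration of the layout described above, the following sketch writes a small TAB-delimited file with columns N, V and V1 using base R only and reads it back with read.delim. The numbers are toy values invented for this example, and zipfR itself is not used here.

```r
# Toy data, invented for illustration only
toy <- data.frame(
  N  = c(100L, 200L, 300L),   # increasing sample sizes
  V  = c(60L, 100L, 130L),    # vocabulary sizes V(N)
  V1 = c(40L, 62L, 75L)       # hapax counts V_1(N)
)

fname <- tempfile(fileext = ".vgc")

# TAB-delimited, with column headers but no row names
write.table(toy, fname, sep = "\t", quote = FALSE, row.names = FALSE)

# read.delim expects exactly this layout (header line, TAB separator)
back <- read.delim(fname)

# the two mandatory columns are present and N is strictly increasing
stopifnot(all(c("N", "V") %in% colnames(back)))
stopifnot(!is.unsorted(back$N, strictly = TRUE))
```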
Details
If the filename file ends in the extension .gz, .bz2 or .xz,
the disk file will automatically be decompressed (read.vgc) or compressed (write.vgc).
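For readers who want to see what a compressed file of this kind amounts to, here is a minimal sketch using base R connections. It assumes nothing about how zipfR implements the compression internally; gzfile is the standard base R connection for gzip files, and the data are toy values invented for this example.

```r
# Toy data, invented for illustration only
toy <- data.frame(N = c(100L, 200L), V = c(60L, 100L))

gz_name <- tempfile(fileext = ".vgc.gz")

# write the TAB-delimited table through a gzip connection
con <- gzfile(gz_name, "w")
write.table(toy, con, sep = "\t", quote = FALSE, row.names = FALSE)
close(con)

# read it back; the gzip connection decompresses on the fly
back <- read.delim(gzfile(gz_name))
stopifnot(identical(back$N, toy$N), identical(back$V, toy$V))
```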
Value
read.vgc returns an object of class vgc (see the
vgc manpage for details).
See Also
See the vgc manpage for details on vgc objects.
See read.tfl and read.spc for
import/export of other data structures.
Examples
## save Italian ultra- prefix VGC to external text file
fname <- tempfile(fileext=".vgc")
write.vgc(ItaUltra.emp.vgc, fname)
## now <fname> is a TAB-delimited text file with columns N, V and V1
## we read it back in
New.vgc <- read.vgc(fname)
## same vgc as ItaUltra.emp.vgc, compare:
summary(New.vgc)
summary(ItaUltra.emp.vgc)
head(New.vgc)
head(ItaUltra.emp.vgc)
stopifnot(isTRUE(all.equal(New.vgc, ItaUltra.emp.vgc))) # should be identical