Tiger {zipfR} | R Documentation
Tiger NP and PP expansions (zipfR)
Description
Objects of classes tfl, spc and vgc that contain frequency data for the syntactic expansions of Noun Phrases (NP) and Prepositional Phrases (PP) in the Tiger German treebank.
Usage
TigerNP.tfl
TigerNP.spc
TigerNP.emp.vgc
TigerPP.tfl
TigerPP.spc
TigerPP.emp.vgc
Details
In this dataset, types are not words, but syntactic expansions, i.e., sequences of syntactic categories that form NPs (in TigerNP) or PPs (in TigerPP), according to the Tiger annotation scheme for German. Thus, for example, among the expansion types in the TigerNP dataset, we find ART_NN and ART_ADJA_NN, whereas among the PP expansions in TigerPP we find APPR_ART_NN and APPR_NN (APPR is the tag for prepositions in the Tiger tagset).
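As a small illustration of what an expansion type is, the labels quoted above can be split back into their category sequences with base R. This is only a sketch: it assumes that the categories inside a label are joined by underscores, as in the four examples given here, and split_expansion is a hypothetical helper, not part of zipfR.

```r
# Expansion type labels quoted in the text above
np_types <- c("ART_NN", "ART_ADJA_NN")
pp_types <- c("APPR_ART_NN", "APPR_NN")

# Hypothetical helper (not a zipfR function).
# Assumption: categories within a label are separated by underscores.
split_expansion <- function(label) strsplit(label, "_", fixed = TRUE)[[1]]

# Category sequence of a single PP expansion
split_expansion("APPR_ART_NN")

# Number of categories in each NP expansion
np_lengths <- lengths(strsplit(np_types, "_", fixed = TRUE))

# First category of each PP expansion (the preposition tag in these examples)
pp_heads <- vapply(pp_types, function(x) split_expansion(x)[1], character(1))
```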
The Tiger treebank contains about 900,000 tokens (50,000 sentences) of German newspaper text from the Frankfurter Rundschau. The token frequencies of the expansion types are taken from this corpus.
TigerNP.tfl and TigerPP.tfl are the type frequency lists. TigerNP.spc and TigerPP.spc are frequency spectra. TigerNP.emp.vgc and TigerPP.emp.vgc are the corresponding observed vocabulary growth curves (tracking the development of V and V(1) in the original order of occurrence of the expansion tokens in the source corpus).
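To make the relation between the three kinds of objects concrete, the sketch below derives a type frequency list, a frequency spectrum and a vocabulary growth curve from a short token sequence using base R only. The token sequence is invented for illustration and is not taken from the Tiger data, and the resulting objects are plain tables and matrices rather than zipfR tfl, spc or vgc objects.

```r
# Toy sequence of expansion tokens in order of occurrence
# (invented for illustration, NOT Tiger data)
tokens <- c("ART_NN", "ART_ADJA_NN", "ART_NN", "NN",
            "ART_NN", "ART_ADJA_NN", "NE")

# Type frequency list: frequency of each expansion type
tfl <- sort(table(tokens), decreasing = TRUE)

# Frequency spectrum: number of types that occur exactly m times
spc <- table(as.vector(tfl))

# Vocabulary growth curve: V (number of types) and V1 (number of
# types seen exactly once) after each of the first n tokens
vgc <- t(sapply(seq_along(tokens), function(n) {
  f <- table(tokens[1:n])
  c(N = n, V = length(f), V1 = sum(f == 1))
}))

tfl
spc
vgc
```

Because the growth curve is computed over prefixes of the token sequence, it depends on the order of occurrence, whereas the type frequency list and the spectrum do not.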
References
Tiger Project: https://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger/
Examples
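library(zipfR)

# NP expansions: type frequency list, frequency spectrum, observed growth curve
TigerNP.tfl
summary(TigerNP.spc)
summary(TigerNP.emp.vgc)

# PP expansions: the same three objects
TigerPP.tfl
summary(TigerPP.spc)
summary(TigerPP.emp.vgc)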