| kRp.text,-class {koRpus} | R Documentation |
S4 Class kRp.text
Description
This class is used for objects that are returned by treetag or tokenize.
Slots
lang: A character string, naming the language that is assumed for the tokenized text in this object.
desc: Descriptive statistics of the tagged text.
tokens: Results of the called tokenizer and POS tagger. The data.frame usually has eleven columns:
doc_id: Factor, optional document identifier.
token: Character, the tokenized text.
tag: Factor, POS tags for each token.
lemma: Character, lemma for each token.
lttr: Integer, number of letters.
wclass: Factor, word class.
desc: Factor, a short description of the POS tag.
stop: Logical, TRUE if token is a stopword.
stem: Character, stemmed token.
idx: Integer, index number of the token in this document.
sntc: Integer, number of the sentence in this document.
This data.frame structure adheres to the "Text Interchange Formats" guidelines set out by rOpenSci[1].
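To make the column layout concrete, the following minimal sketch builds a data.frame with the same eleven columns and types in base R. It is only a hand-made mock: the two tokens, their tags, and all other values are invented for illustration and were not produced by treetag or tokenize.

```r
# Hypothetical two-token example; every value below is made up for
# illustration and does not come from an actual tokenizer or POS tagger.
tokens <- data.frame(
  doc_id = factor(c("doc1", "doc1")),       # optional document identifier
  token  = c("Hello", "world"),             # the tokenized text
  tag    = factor(c("UH", "NN")),           # POS tag (assumed tag names)
  lemma  = c("hello", "world"),             # lemma for each token
  lttr   = nchar(c("Hello", "world")),      # number of letters (integer)
  wclass = factor(c("interjection", "noun")),
  desc   = factor(c("Interjection", "Noun, singular or mass")),
  stop   = c(FALSE, FALSE),                 # TRUE if token is a stopword
  stem   = c("hello", "world"),             # stemmed token
  idx    = 1:2,                             # index of token in the document
  sntc   = c(1L, 1L),                       # number of the sentence
  stringsAsFactors = FALSE
)

# Summarise the first class of each column to compare with the list above.
col_types <- vapply(tokens, function(col) class(col)[1], character(1))
print(col_types)
```

In a real object, the corresponding data.frame lives in the tokens slot and is filled in by the tagger, so there is normally no need to assemble it by hand.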
features: A named logical vector, indicating which features are available in this object's feat_list slot. Common features are listed in the description of the feat_list slot.
feat_list: A named list with optional analysis results or other content as used by the defined features:
hyphen: A named list of objects of class kRp.hyphen.
readability: A named list of objects of class kRp.readability.
lex_div: A named list of objects of class kRp.TTR.
freq: A list with additional results of freq.analysis.
corp_freq: An object of class kRp.corp.freq, e.g., results of a call to read.corp.custom.
diff: Additional results of calls to a method like textTransform.
doc_term_matrix: A sparse document-term matrix, as produced by docTermMatrix.
See the getter and setter methods for easy access to these sub-slots. There can actually be any number of additional features; the above is just a list of those already defined by this package.
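The relationship between the features vector and the feat_list slot can be sketched in base R as follows. This is a mock built by hand: the list entries are placeholders rather than real analysis results, and whether the package also stores FALSE entries for absent features is an assumption made here only to keep the illustration simple.

```r
# Mock feat_list with two placeholder entries (not real koRpus results).
feat_list <- list(
  freq = list(note = "placeholder for freq.analysis results"),
  diff = list(note = "placeholder for textTransform results")
)

# Feature names taken from the list of sub-slots described above.
known <- c("hyphen", "readability", "lex_div", "freq",
           "corp_freq", "diff", "doc_term_matrix")

# Named logical vector: TRUE where feat_list holds a matching entry.
features <- setNames(known %in% names(feat_list), known)

# Names of the features that are flagged as available.
available <- names(features)[features]
print(available)
```

On actual objects, the getter and setter methods mentioned above are the intended way to read or change these sub-slots, rather than manipulating the vector and list directly.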
Constructor function
Should you need to manually generate objects of this class (which should rarely be the case), the constructor function kRp_text(...) can be used instead of new("kRp.text", ...).
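A minimal sketch of a manual call might look like the following. It assumes that the koRpus package is installed and that the remaining slots have valid defaults, so that only lang needs to be supplied; neither assumption is stated on this page, which is why the call is guarded and wrapped in tryCatch.

```r
obj <- NULL

# Only attempt the call if koRpus is available on this system.
if (requireNamespace("koRpus", quietly = TRUE)) {
  # Assumption: all slots other than 'lang' fall back to valid defaults.
  # If that does not hold, the error is caught and obj stays NULL.
  obj <- tryCatch(
    koRpus::kRp_text(lang = "en"),
    error = function(e) NULL
  )
}

# List the slots of the generated object, if one was created.
if (!is.null(obj)) {
  print(methods::slotNames(obj))
}
```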
Note
There are also as() methods to transform objects from other koRpus classes into kRp.text.
References
[1] Text Interchange Formats (https://github.com/ropensci/tif)