kRp.POS.tags {koRpus} | R Documentation |
Get elaborated word tag definitions
Description
This function can be used to get a set of part-of-speech (POS) tags for a given language. These tag sets should conform with the ones used by TreeTagger.
Usage
kRp.POS.tags(
lang = get.kRp.env(lang = TRUE),
list.classes = FALSE,
list.tags = FALSE,
tags = c("words", "punct", "sentc")
)
Arguments
lang |
A character string defining a language (see details for valid choices). |
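list.classes |
Logical, if TRUE only the known word classes for the chosen language will be returned. |
list.tags |
Logical, if TRUE only the POS tags for the chosen language will be returned. |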
tags |
A character vector with at least one of "words", "punct" or "sentc". |
Details
Use available.koRpus.lang to get a list of all supported languages. Language support packages must be installed and loaded to be usable with kRp.POS.tags.
For the internal tokenizer a small subset of tags is also defined, available through lang="kRp". Finally, the Universal POS Tags[1] are automatically appended if no matching tag was already defined.
If you don't know the language your text was written in, the function guess.lang should be able to detect it.
With the argument tags you can specify whether you want all tag definitions or only a subset, e.g. tags only for punctuation and sentence endings (that is, you need to request both "punct" and "sentc" to get all punctuation tags).
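As a minimal sketch of restricting the result to a tag subset, the following compares the default call with one limited to punctuation and sentence-ending tags. It assumes that koRpus and the koRpus.lang.en language package are installed; the object names tags.all and tags.punct are chosen here for illustration only.

```r
# Sketch only: assumes the English language package koRpus.lang.en is installed.
if (require("koRpus.lang.en", quietly = TRUE)) {
  # default: definitions for words, punctuation and sentence endings
  tags.all <- kRp.POS.tags("en")
  # punctuation only: request both "punct" and "sentc"
  tags.punct <- kRp.POS.tags("en", tags = c("punct", "sentc"))
  # compare the number of tag definitions in each result
  print(nrow(tags.all))
  print(nrow(tags.punct))
} else {
  message("koRpus.lang.en is not available, skipping the example")
}
```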
The function is not primarily intended to be used directly; it is called internally by several other functions. However, it can still be useful for examining the available POS tags directly.
Value
If list.classes=FALSE and list.tags=FALSE, returns a matrix with word tag definitions of the given language.
The matrix has three columns:
tag: Word tag
class: Respective word class
desc: "Human readable" description of what the tag stands for
Otherwise, a vector with the known word classes or POS tags for the chosen language (and possibly tag subset) will be returned.
If both list.classes and list.tags are TRUE, still only the POS tags will be returned.
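The different return values described above could be inspected roughly as follows. This is a sketch that assumes koRpus.lang.en is installed; the object names are illustrative. The final comparison only checks the documented behaviour when both logical arguments are TRUE, and no particular output is implied here.

```r
# Sketch only: assumes the English language package koRpus.lang.en is installed.
if (require("koRpus.lang.en", quietly = TRUE)) {
  # full matrix of tag definitions
  tags.en <- kRp.POS.tags("en")
  print(dim(tags.en))
  print(head(tags.en))
  # vector of known word classes only
  classes.en <- kRp.POS.tags("en", list.classes = TRUE)
  # vector of POS tags only
  tagnames.en <- kRp.POS.tags("en", list.tags = TRUE)
  # both set to TRUE: documented to return the POS tags only
  both.en <- kRp.POS.tags("en", list.classes = TRUE, list.tags = TRUE)
  print(identical(both.en, tagnames.en))
} else {
  message("koRpus.lang.en is not available, skipping the example")
}
```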
References
[1] https://universaldependencies.org/u/pos/index.html
See Also
get.kRp.env, available.koRpus.lang, install.koRpus.lang
Examples
# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  # tag definitions used by the internal tokenizer
  tags.internal <- kRp.POS.tags("kRp")
  # tag definitions for English
  tags.en <- kRp.POS.tags("en")
} else {}