bow_pp_create_vocab_draft {aifeducation} | R Documentation |
Function for creating a first draft of a vocabulary
Description
This function creates a first draft of a vocabulary: a list of tokens tagged with the specified universal part-of-speech (UPOS) tags, together with the corresponding lemmas.
Usage
bow_pp_create_vocab_draft(
path_language_model,
data,
upos = c("NOUN", "ADJ", "VERB"),
label_language_model = NULL,
language = NULL,
chunk_size = 100,
trace = TRUE
)
Arguments
path_language_model
data
upos
label_language_model
language
chunk_size
trace
Value
list with the following components:
vocab: data.frame containing the tokens, lemmas, tokens in lower case, and lemmas in lower case.
ud_language_model: udpipe language model that is used for tagging.
label_language_model: Label of the udpipe language model.
language: Language of the raw texts.
upos: Universal part-of-speech tags used.
n_sentence: int. Estimated number of sentences in the raw texts.
n_token: int. Estimated number of tokens in the raw texts.
n_document_segments: int. Estimated number of document segments/raw texts.
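To make the idea behind the vocab component more concrete, the following is a minimal base R sketch of filtering tokens by UPOS tag and adding lower-case variants. It is not the package's implementation: the toy annotation table is hand-written rather than produced by udpipe, and the helper draft_vocab() and all column names are assumptions chosen for this illustration.

```r
# Hand-written toy annotation (NOT real udpipe output). The column names
# are assumptions made for this illustration only.
annotation <- data.frame(
  token = c("The", "Teachers", "explained", "difficult", "concepts", "quickly"),
  lemma = c("the", "teacher", "explain", "difficult", "concept", "quickly"),
  upos  = c("DET", "NOUN", "VERB", "ADJ", "NOUN", "ADV"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only tokens whose tag is in `upos`, drop
# duplicate token/lemma pairs, and add lower-case versions of both columns.
draft_vocab <- function(annotation, upos = c("NOUN", "ADJ", "VERB")) {
  kept <- annotation[annotation$upos %in% upos, c("token", "lemma")]
  kept <- unique(kept)
  kept$token_tolower <- tolower(kept$token)
  kept$lemma_tolower <- tolower(kept$lemma)
  rownames(kept) <- NULL
  kept
}

vocab <- draft_vocab(annotation)
print(vocab)
```

With the default tags, the determiner and the adverb are dropped and the four remaining tokens are kept alongside their lemmas. Passing a different upos vector, for example c("NOUN"), narrows the draft accordingly.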
Note
A list of possible tags can be found here: https://universaldependencies.org/u/pos/index.html.
A large number of language models can be found here: https://ufal.mff.cuni.cz/udpipe/2/models.
See Also
Other Preparation:
bow_pp_create_basic_text_rep()