| bow_pp_create_vocab_draft {aifeducation} | R Documentation |
Function for creating a first draft of a vocabulary
Description
This function creates a list of tokens which refer to specific universal part-of-speech tags (UPOS) and provides the corresponding lemmas.
Usage
bow_pp_create_vocab_draft(
path_language_model,
data,
upos = c("NOUN", "ADJ", "VERB"),
label_language_model = NULL,
language = NULL,
chunk_size = 100,
trace = TRUE
)
Arguments
path_language_model |
|
data |
|
upos |
|
label_language_model |
|
language |
|
chunk_size |
|
trace |
|
Value
list with the following components.
vocab: data.frame containing the tokens, lemmas, tokens in lower case, and lemmas in lower case.
ud_language_model: udpipe language model that is used for tagging.
label_language_model: Label of the udpipe language model.
language: Language of the raw texts.
upos: Universal part-of-speech tags that were used.
n_sentence: int. Estimated number of sentences in the raw texts.
n_token: int. Estimated number of tokens in the raw texts.
n_document_segments: int. Estimated number of document segments/raw texts.
Note
A list of possible tags can be found here: https://universaldependencies.org/u/pos/index.html.
A large number of udpipe language models can be found here: https://ufal.mff.cuni.cz/udpipe/2/models.
See Also
Other Preparation:
bow_pp_create_basic_text_rep()
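Examples
The following is a minimal sketch of a call, not an official example from the package. It assumes that data accepts a character vector of raw texts and that a udpipe model file has already been downloaded. The path "models/english-ewt.udpipe", the label "english-ewt", the language value "english", and the two sample sentences are hypothetical placeholders. The call is guarded so that the script does nothing if the package or the model file is missing.

```r
# Hypothetical path: point this at a udpipe model file you downloaded yourself.
model_path <- "models/english-ewt.udpipe"

# Tiny made-up corpus, used only for illustration.
raw_texts <- c(
  "The students read several short stories.",
  "A good teacher explains difficult ideas clearly."
)

# Run only if the package is installed and the model file exists.
can_run <- requireNamespace("aifeducation", quietly = TRUE) &&
  file.exists(model_path)

if (can_run) {
  draft <- aifeducation::bow_pp_create_vocab_draft(
    path_language_model = model_path,
    data = raw_texts,                   # assumption: character vector of raw texts
    upos = c("NOUN", "ADJ", "VERB"),    # same as the documented default
    label_language_model = "english-ewt",  # hypothetical label
    language = "english",                  # hypothetical language name
    chunk_size = 100,
    trace = FALSE
  )

  # Inspect some of the components listed under Value.
  print(head(draft$vocab))
  print(draft$upos)
  print(draft$n_token)
}
```

Restricting upos to a smaller set, for example only "NOUN", should shrink the resulting vocabulary draft, which can be useful for a quick first look at a corpus.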