bow_pp_create_vocab_draft {aifeducation} R Documentation

Function for creating a first draft of a vocabulary

Description

This function creates a list of tokens which refer to specific universal part-of-speech tags (UPOS) and provides the corresponding lemmas.

Usage

bow_pp_create_vocab_draft(
  path_language_model,
  data,
  upos = c("NOUN", "ADJ", "VERB"),
  label_language_model = NULL,
  language = NULL,
  chunk_size = 100,
  trace = TRUE
)

Arguments

path_language_model

string Path to a udpipe language model that should be used for tagging and lemmatization.

data

vector containing the raw texts.

upos

vector containing the universal part-of-speech tags which should be used to build the vocabulary.

label_language_model

string Label for the udpipe language model used.

language

string Name of the language (e.g., English, German).

chunk_size

int Number of raw texts which should be processed at once.

trace

bool TRUE if information about the progress should be printed to the console.
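
To make the roles of upos and chunk_size more concrete, the following base R sketch filters a small annotation table by UPOS tag and splits a vector of texts into chunks. The table is written by hand for illustration (only its column names token, lemma, and upos mirror those of a udpipe annotation), and split_into_chunks is a hypothetical helper. This is not the package's internal implementation, only one way the two ideas could look.

```r
# Toy annotation table, written by hand for illustration only.
# The column names (token, lemma, upos) mirror those of a udpipe annotation.
annotation <- data.frame(
  token = c("The", "students", "wrote", "long", "essays"),
  lemma = c("the", "student", "write", "long", "essay"),
  upos  = c("DET", "NOUN", "VERB", "ADJ", "NOUN"),
  stringsAsFactors = FALSE
)

# Keep only tokens whose tag is in the requested set (the default of `upos`).
upos <- c("NOUN", "ADJ", "VERB")
vocab_draft <- unique(annotation[annotation$upos %in% upos, c("token", "lemma")])

# Hypothetical helper: split raw texts into groups of at most `chunk_size`.
split_into_chunks <- function(texts, chunk_size = 100) {
  unname(split(texts, ceiling(seq_along(texts) / chunk_size)))
}

# 250 placeholder texts are split into chunks of at most 100 texts each.
texts <- paste("text", 1:250)
chunks <- split_into_chunks(texts, chunk_size = 100)
lengths(chunks)
```

A smaller chunk_size presumably lowers the amount of text held in memory at once at the cost of more processing rounds; the value that works best depends on the length of the raw texts.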

Value

list with the following components.

Note

A list of possible tags can be found here: https://universaldependencies.org/u/pos/index.html.

A large number of models can be found here: https://ufal.mff.cuni.cz/udpipe/2/models.

See Also

Other Preparation: bow_pp_create_basic_text_rep()
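
Examples

The call below is a minimal sketch of how the arguments from the Usage section fit together. It assumes that the udpipe package is installed and that its udpipe_download_model() function is used to obtain a model file; the example texts and the label passed to label_language_model are made up for illustration. The run_example switch is a hypothetical guard so that nothing is downloaded unless it is set to TRUE.

```r
# Sketch only: requires the aifeducation and udpipe packages and an
# internet connection for the model download.
run_example <- FALSE  # set to TRUE to download a model and run the call

if (run_example) {
  library(aifeducation)

  # udpipe_download_model() returns a data.frame; the column file_model
  # holds the local path of the downloaded model file.
  model_info <- udpipe::udpipe_download_model(language = "english")

  # Illustrative raw texts (made up for this sketch).
  raw_texts <- c(
    "Teachers give feedback on written essays.",
    "Students revise their drafts after feedback."
  )

  vocab_draft <- bow_pp_create_vocab_draft(
    path_language_model = model_info$file_model,
    data = raw_texts,
    upos = c("NOUN", "ADJ", "VERB"),
    label_language_model = "english (illustrative label)",
    language = "English",
    chunk_size = 100,
    trace = TRUE
  )

  # Inspect the top-level structure of the returned list.
  str(vocab_draft, max.level = 1)
}
```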


[Package aifeducation version 0.3.3 Index]