f6 {MadanTextNetwork} | R Documentation
Extract Token Information from Data Frame
Description
Extracts the token, lemma, and part-of-speech (POS) tag columns from a data frame and returns them as a new data frame with the columns renamed to 'TOKEN', 'LEMMA', and 'TYPE'.
Usage
f6(UPIP)
Arguments
UPIP | A data frame containing the columns 'token' (the tokens), 'lemma' (their lemmatized forms), and 'upos' (their POS tags).
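Because the argument must carry all three columns, it can help to check an input before passing it on. The following is a minimal sketch of such a check; `missing_token_columns` is a hypothetical helper written for illustration and is not part of MadanTextNetwork, and how f6 itself reacts to a missing column is not documented here.

```r
# Hypothetical helper, not part of MadanTextNetwork: returns the names of
# the columns required by the argument description that are absent from df.
missing_token_columns <- function(df) {
  required <- c("token", "lemma", "upos")
  setdiff(required, names(df))
}

ok  <- data.frame(token = "running", lemma = "run", upos = "VERB")
bad <- data.frame(token = "running", lemma = "run")

missing_token_columns(ok)   # character(0): all required columns present
missing_token_columns(bad)  # "upos"
```

An empty result means the data frame has every column named in the argument description.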
Value
A data frame with three columns: 'TOKEN', 'LEMMA', and 'TYPE'. 'TOKEN' holds the original tokens from the 'token' column of the input, 'LEMMA' holds their lemmatized forms from the 'lemma' column, and 'TYPE' holds their POS tags from the 'upos' column. The result has the same number of rows as the input, and each row carries the token, lemma, and POS tag from the corresponding input row.
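The column mapping described above can be sketched in base R as follows. This is an illustration of the documented behavior only: `f6_sketch` is a hypothetical stand-in, and the package's actual implementation of f6 may differ.

```r
# Hypothetical stand-in for f6(); reproduces only the documented mapping
# token -> TOKEN, lemma -> LEMMA, upos -> TYPE.
f6_sketch <- function(UPIP) {
  data.frame(
    TOKEN = UPIP$token,
    LEMMA = UPIP$lemma,
    TYPE  = UPIP$upos,
    stringsAsFactors = FALSE
  )
}

input <- data.frame(
  token = c("running", "jumps"),
  lemma = c("run", "jump"),
  upos  = c("VERB", "VERB"),
  stringsAsFactors = FALSE
)

result <- f6_sketch(input)
names(result)                # "TOKEN" "LEMMA" "TYPE"
nrow(result) == nrow(input)  # TRUE
```

The sketch copies the three columns row for row, so the output keeps the input's row count and order.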
Examples
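# Input with one row per token: surface form, lemma, and POS tag
data <- data.frame(token = c("running", "jumps"),
                   lemma = c("run", "jump"),
                   upos  = c("VERB", "VERB"))
# Build the TOKEN / LEMMA / TYPE data frame
token_info <- f6(data)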