f7 {MadanTextNetwork} | R Documentation |
Extract and Count Specific Parts of Speech
Description
This function extracts the lemmas tagged with a specified part of speech (POS) from the given data frame and counts how often each lemma occurs.
Usage
f7(UPIP, type)
Arguments
UPIP
A data frame with columns 'upos' (POS tags) and 'lemma' (lemmatized tokens).
type
A character string naming the POS tag to keep (e.g., 'NOUN', 'VERB').
Value
Returns a data frame where each row corresponds to a unique lemma of the specified POS type. The data frame has two columns: 'key', which contains the lemma, and 'freq', which contains the frequency count of that lemma in the data. The rows are ordered in decreasing frequency of occurrence. This format is useful for quickly identifying the most common terms of a particular POS in the data.
Examples
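library(MadanTextNetwork)
# Toy annotation table with the two columns f7() requires
data <- data.frame(upos = c('NOUN', 'VERB'), lemma = c('house', 'run'))
noun_freq <- f7(data, 'NOUN')  # frequency table of the NOUN lemmas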
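
The filter-count-sort behaviour described under Value can be approximated in base R. The sketch below is only an illustration of the expected output shape; pos_freq_sketch and the annotated toy table are hypothetical names introduced here, and the code is not the package's actual implementation of f7.

```r
# Hypothetical stand-in for f7(); NOT the package's actual implementation.
pos_freq_sketch <- function(UPIP, type) {
  # Keep only the lemmas whose POS tag matches `type`
  lemmas <- as.character(UPIP$lemma[UPIP$upos == type])
  # Count occurrences of each distinct lemma
  counts <- table(lemmas)
  out <- data.frame(key = names(counts), freq = as.integer(counts),
                    stringsAsFactors = FALSE)
  # Most frequent lemmas first
  out <- out[order(out$freq, decreasing = TRUE), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy annotation table (assumed data, for illustration only)
annotated <- data.frame(
  upos  = c('NOUN', 'VERB', 'NOUN', 'NOUN', 'VERB'),
  lemma = c('house', 'run', 'house', 'garden', 'run'),
  stringsAsFactors = FALSE
)
pos_freq_sketch(annotated, 'NOUN')
```

For this toy table the result has two rows: 'house' with freq 2, followed by 'garden' with freq 1.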
[Package MadanTextNetwork version 0.1.0 Index]