get_document_frequencies {dhlabR}        R Documentation

Retrieve Token Frequencies in Documents

Description

Retrieves token frequencies for the specified documents, optionally restricted to a given set of words and a frequency cutoff.

Usage

get_document_frequencies(pids, cutoff = 0, words = NULL)

Arguments

pids

A vector or data frame containing document IDs.

cutoff

A numeric value specifying the frequency cutoff for tokens. Defaults to 0.

words

A vector of words (tokens) to retrieve frequencies for. Defaults to NULL.

Value

A list containing token frequency information for each document.

Examples

# Two document URNs to query
document_ids <- c("URN:NBN:no-nb_digibok_2008051404065", "URN:NBN:no-nb_digibok_2010092120011")
# Frequency cutoff and the tokens of interest
frequency_cutoff <- 10
tokens <- c(".", ",", "men")
result <- get_document_frequencies(pids = document_ids, cutoff = frequency_cutoff, words = tokens)
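
The layout of the returned value is not spelled out on this page, so it can help to inspect it after the call. The following is a minimal sketch, not part of the package's own examples. It assumes that dhlabR is installed and that the call may fail, for instance if the remote service cannot be reached; the guard and error handling are illustrative additions.

```r
# Inputs copied from the example on this page
document_ids <- c(
  "URN:NBN:no-nb_digibok_2008051404065",
  "URN:NBN:no-nb_digibok_2010092120011"
)
tokens <- c(".", ",", "men")

# Assumption: the call can fail (package missing, service unreachable),
# so guard it and fall back to NULL instead of stopping the script.
result <- NULL
if (requireNamespace("dhlabR", quietly = TRUE)) {
  result <- tryCatch(
    dhlabR::get_document_frequencies(
      pids = document_ids,
      cutoff = 10,
      words = tokens
    ),
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Print the structure of whatever came back, without assuming its shape
if (!is.null(result)) {
  str(result)
}
```

Using str() avoids relying on particular element names, which this page does not list.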

[Package dhlabR version 1.0.6 Index]