PMI {MadanText}R Documentation

Calculate Pointwise Mutual Information (PMI)

Description

This function calculates the PMI for collocations in a given text data.

Usage

PMI(x)

Arguments

x

A data frame with columns 'token' and 'doc_id'.

Value

Returns a data frame where each row represents a unique keyword (collocation) in the input data. The data frame contains columns such as 'keyword', representing the keyword, and 'pmi', representing the PMI score of that keyword. Higher PMI scores indicate a stronger association between the components of the collocation within the corpus.

Examples

data <- data.frame(token = c("word1", "word2"), doc_id = c(1, 1))
pmi_scores <- PMI(data)

[Package MadanText version 0.1.0 Index]