R: Perform Text Mining of a Given Column

word_cloud_prep {scicomptools}

R Documentation

Perform Text Mining of a Given Column

Description

Mines a user-specified text column and returns a dataframe that is ready for word cloud creation. Any user-defined "bigrams" (i.e., two-word phrases) supplied as a vector are identified before single words are mined.

Usage

word_cloud_prep(
  data = NULL,
  text_column = NULL,
  word_count = 50,
  known_bigrams = c("working group")
)

Arguments

`data`	(dataframe) Data object containing at least one column
`text_column`	(character) Name of column in dataframe given to 'data' that contains the text to be mined
`word_count`	(numeric) Number of words to return, ranked from most to least frequent
`known_bigrams`	(character) Vector of all bigrams (two-word phrases) to be mined before mining for single words
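
The "bigrams first, then single words" order described for 'known_bigrams' can be illustrated with a minimal base R sketch. This is not the scicomptools implementation: the helper name `sketch_word_counts`, its `stop_words` argument, the short stop word list, and the 'n' count column are all assumptions made for illustration only.

```r
# Hypothetical helper for illustration; NOT the scicomptools implementation
sketch_word_counts <- function(text,
                               known_bigrams = c("working group"),
                               stop_words = c("the", "a", "of", "and", "is")) {
  # Lowercase the text and strip punctuation
  clean <- gsub("[[:punct:]]", "", tolower(text))

  # Step 1: mine the known bigrams before any single words
  found <- character(0)
  for (bigram in tolower(known_bigrams)) {
    # Count literal occurrences of the bigram across all strings
    # (plain substring matching, so it ignores word boundaries)
    hits <- gregexpr(bigram, clean, fixed = TRUE)
    n_hits <- sum(vapply(hits, function(h) sum(h > 0), integer(1)))
    found <- c(found, rep(bigram, n_hits))

    # Remove the bigram so its parts are not counted again as single words
    clean <- gsub(bigram, " ", clean, fixed = TRUE)
  }

  # Step 2: split what remains into single words and drop stop words
  words <- unlist(strsplit(clean, "[[:space:]]+"))
  words <- words[nzchar(words) & !(words %in% stop_words)]

  # Step 3: tally bigrams and words together, most frequent first
  counts <- sort(table(c(found, words)), decreasing = TRUE)
  data.frame(word = names(counts), n = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Made-up example text
example_text <- c("The working group met",
                  "A working group of ecologists",
                  "Ecologists met")
result <- sketch_word_counts(text = example_text)
print(result)
```

Removing each bigram from the text before splitting is what keeps "working" and "group" from also appearing as separate single words in the result.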

Value

A dataframe with one column (named 'word') that can be used for word cloud creation. It contains one row per bigram supplied in 'known_bigrams' or per single word (not including "stop words").

Examples

# Create a dataframe containing some example text
text <- data.frame(article_num = 1:6,
                   article_title = c("Why pigeons are the best birds",
                                     "10 ways to show your pet budgie love",
                                     "Should you feed ducks at the park?",
                                     "Locations and tips for birdwatching",
                                     "How to tell which pet bird is right for you",
                                     "Do birds make good pets?"))
                                     
# Prepare the dataframe for word cloud plotting
word_cloud_prep(data = text, text_column = "article_title")

# Plot the word cloud
word_cloud_plot(data = text, text_column = "article_title")

[Package scicomptools version 1.0.0 Index]