fst_freq {finnsurveytext}    R Documentation

Find and Plot Top Words

Description

Creates a plot of the most frequently occurring words (unigrams) in the data.

Usage

fst_freq(
  data,
  number = 10,
  norm = "number_words",
  pos_filter = NULL,
  strict = TRUE,
  name = NULL
)

Arguments

data

A dataframe of text in CoNLL-U format.

number

The number of top words to plot, default is '10'.

norm

The method for normalising the data. Valid settings are '"number_words"' (the number of words in the responses, default), '"number_resp"' (the number of responses), or 'NULL' (raw count returned).
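
The three settings can be illustrated with a small base R sketch. This is not the package's implementation: the toy table and its column names ('doc_id', 'lemma') are assumptions modelled on typical CoNLL-U output, and the exact denominators used by 'fst_freq' may differ (for example in how punctuation is counted).

```r
# Toy CoNLL-U-style table (hypothetical data; column names are an assumption).
toy <- data.frame(
  doc_id = c("r1", "r1", "r1", "r2", "r2", "r3"),
  lemma  = c("koti", "perhe", "koti", "koti", "raha", "perhe"),
  stringsAsFactors = FALSE
)

raw     <- table(toy$lemma)             # raw counts, cf. norm = NULL
n_words <- nrow(toy)                    # total number of words in all responses
n_resp  <- length(unique(toy$doc_id))   # number of distinct responses

per_word <- raw / n_words               # cf. norm = "number_words"
per_resp <- raw / n_resp                # cf. norm = "number_resp"
```

Normalising by the number of responses can give values above 1 when a word occurs more than once per response on average.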

pos_filter

List of UPOS tags for inclusion, default is 'NULL', which means all word types are included.
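
As a rough illustration of what such a filter amounts to, the base R sketch below keeps only rows whose tag is in the list. The toy table and the 'upos' column name are assumptions, and this is not the package's own code.

```r
# Toy table with one row per word (hypothetical data; 'upos' column assumed).
toy <- data.frame(
  lemma = c("koti", "olla", "hyva", "perhe"),
  upos  = c("NOUN", "AUX", "ADJ", "NOUN"),
  stringsAsFactors = FALSE
)

pos_filter <- c("NOUN", "ADJ")

# NULL means no filtering; otherwise keep only the listed UPOS tags.
kept <- if (is.null(pos_filter)) toy else toy[toy$upos %in% pos_filter, ]
```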

strict

Whether to cut off strictly at 'number' (ties are ordered alphabetically), default is 'TRUE'.
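
The difference between the two settings can be sketched in base R. This assumes that 'strict = FALSE' keeps every word tied with the last place, which the description above implies but does not state; the counts are made up for illustration.

```r
# Hypothetical word counts with a three-way tie.
counts <- c(koti = 5, perhe = 3, raha = 3, aika = 3, ystava = 1)
number <- 2

# Sort by descending count, breaking ties alphabetically.
ranked <- counts[order(-counts, names(counts))]

# Strict cut-off: exactly 'number' words.
strict_top <- head(ranked, number)

# Non-strict (assumed behaviour): keep everything tied with the last place.
cutoff    <- ranked[[number]]
loose_top <- ranked[ranked >= cutoff]
```

Here the strict result has two words, while the non-strict result has four because three words share the count at the cut-off.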

name

An optional "name" to add to the plot title, default is 'NULL'.

Value

Plot of top words.

Examples

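# Example data in CoNLL-U format
q11_1 <- conllu_dev_q11_1
n1 <- "number_resp"
# Top 12 words, normalised by number of responses, without a strict cut-off
fst_freq(q11_1, number = 12, norm = n1, strict = FALSE, name = "All")
# Top 15 words with the default normalisation ("number_words")
fst_freq(q11_1, number = 15, name = "Not Spec")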

[Package finnsurveytext version 1.0.0 Index]