R: Find and Plot Top N-grams

fst_ngrams {finnsurveytext}

R Documentation

Find and Plot Top N-grams

Description

Creates a plot of the most frequently occurring n-grams within the data.

Usage

fst_ngrams(
  data,
  number = 10,
  ngrams = 1,
  norm = "number_words",
  pos_filter = NULL,
  strict = TRUE,
  name = NULL
)

Arguments

`data`	A dataframe of text in CoNLL-U format.
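`number`	The number of top n-grams to return, default is '10'.
`ngrams`	The size of the n-grams, that is, the number of consecutive words in each ('1' for single words, '2' for bigrams, and so on), default is '1'.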
`norm`	The method for normalising the data. Valid settings are '"number_words"' (the number of words in the responses, default), '"number_resp"' (the number of responses), or 'NULL' (raw count returned).
`pos_filter`	List of UPOS tags for inclusion, default is 'NULL', which means all word types are included.
`strict`	Whether to cut off strictly at 'number' (ties are ordered alphabetically), default is 'TRUE'.
`name`	An optional "name" for the plot to add to title, default is 'NULL'.
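
As an illustration of how 'norm' and 'strict' could interact with a table of n-gram counts, here is a minimal base R sketch. It is not the package's implementation: the helpers `normalise()` and `top_n()`, the toy counts, and the totals are all hypothetical, and the exact scaling and tie handling inside fst_ngrams may differ.

```r
# Toy n-gram counts (hypothetical data, not taken from the package).
counts <- data.frame(
  ngram = c("apple", "pear", "fig", "kiwi", "plum"),
  n = c(5, 3, 3, 3, 1),
  stringsAsFactors = FALSE
)
total_words <- 30  # assumed total number of words in all responses
total_resp <- 10   # assumed number of responses

# Hypothetical helper: scale raw counts according to 'norm'.
normalise <- function(n, norm, total_words, total_resp) {
  if (is.null(norm)) {
    return(n)  # NULL: raw counts are returned unchanged
  }
  switch(norm,
    number_words = n / total_words,
    number_resp = n / total_resp,
    stop("unknown norm: ", norm)
  )
}

# Hypothetical helper: keep the top 'number' rows.
top_n <- function(counts, number, strict = TRUE) {
  # Sort by descending count; ties are ordered alphabetically.
  ordered <- counts[order(-counts$n, counts$ngram), ]
  if (strict) {
    head(ordered, number)
  } else {
    # Keep every row tied with the row at position 'number'.
    cutoff <- ordered$n[min(number, nrow(ordered))]
    ordered[ordered$n >= cutoff, ]
  }
}

strict_top <- top_n(counts, 2, strict = TRUE)
loose_top <- top_n(counts, 2, strict = FALSE)
resp_share <- normalise(counts$n, "number_resp", total_words, total_resp)
```

In this sketch, `strict_top` holds exactly two rows, while `loose_top` also keeps the rows tied with the second-ranked one, so it can be longer than 'number'.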

Value

A plot of the top n-grams.

Examples

q11_1 <- conllu_dev_q11_1
fst_ngrams(q11_1, 12, ngrams = 2, norm = NULL, strict = FALSE, name = "All")
fst_ngrams(conllu_dev_q11_1_na, number = 15, ngrams = 3, name = "Not Spec")

[Package finnsurveytext version 1.0.0 Index]