fst_cn_edges {finnsurveytext}    R Documentation

Concept Network - Get textrank edges

Description

This function takes a string of terms (separated by commas) or a single term and uses 'fst_cn_search()' to find words connected to the searched terms. It then returns a dataframe of 'edges', where each edge links two words that occur together in a frequently occurring n-gram containing a concept term.
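As a rough illustration of the input format, a comma-separated concept string can be broken into individual terms with base R. This is only a sketch: 'split_concepts()' is a hypothetical helper, not a finnsurveytext function, and the package's own parsing may differ.

```r
# Hypothetical helper (not part of finnsurveytext): split a comma-separated
# string of concept terms into a character vector of trimmed terms.
split_concepts <- function(concepts) {
  terms <- trimws(strsplit(concepts, ",", fixed = TRUE)[[1]])
  terms[nzchar(terms)]  # drop empty entries, e.g. from a trailing comma
}

split_concepts("kiusata, lyöminen")  # two terms
split_concepts("lyöminen")           # a single term is returned unchanged
```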

Usage

fst_cn_edges(
  data,
  concepts,
  threshold = NULL,
  norm = "number_words",
  pos_filter = NULL
)

Arguments

data

A dataframe of text in CoNLL-U format.

concepts

String of terms to search for, separated by commas (or a single term).

threshold

The minimum number of occurrences required for an 'edge' between a searched term and another word to be included, default is 'NULL'. Note that the threshold is applied before normalisation.

norm

The method for normalising the data. Valid settings are '"number_words"' (the number of words in the responses, default), '"number_resp"' (the number of responses), or 'NULL' (raw count returned).

pos_filter

List of UPOS tags for inclusion, default is 'NULL' to include all UPOS tags.
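The interaction between 'threshold' and 'norm' can be sketched with toy data. The snippet below is a minimal illustration, not the package's implementation: the column names ('from', 'to', 'n'), the counts, the denominators, and the helper 'normalise_edges()' are all assumptions, and the threshold is assumed to be inclusive (count >= threshold).

```r
# Toy edge counts (made-up values, hypothetical column names).
edges <- data.frame(
  from = c("a", "a", "b"),
  to   = c("b", "c", "c"),
  n    = c(5, 1, 3)
)

total_words <- 100  # assumed denominator for norm = "number_words"
total_resp  <- 20   # assumed denominator for norm = "number_resp"

# Hypothetical helper: filter on raw counts first, then normalise.
normalise_edges <- function(edges, threshold = NULL, denom = NULL) {
  if (!is.null(threshold)) {
    # Threshold is compared against the raw count, before any scaling.
    edges <- edges[edges$n >= threshold, , drop = FALSE]
  }
  if (!is.null(denom)) {
    edges$n <- edges$n / denom
  }
  edges
}

normalise_edges(edges, threshold = 2, denom = total_words)  # like "number_words"
normalise_edges(edges, threshold = 2, denom = total_resp)   # like "number_resp"
normalise_edges(edges, threshold = 2)                       # like norm = NULL
```

Because filtering happens on raw counts, the same 'threshold' keeps the same edges regardless of which normalisation is chosen.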

Value

Dataframe of co-occurrences between two connected words.

Examples

con <- "kiusata, lyöminen"
cb <- conllu_cb_bullying_iso
fst_cn_edges(cb, con, pos_filter = c("NOUN", "VERB", "ADJ", "ADV"))
fst_cn_edges(cb, "lyöminen", threshold = 2, norm = "number_resp")

[Package finnsurveytext version 1.0.0 Index]