wrap_documents {tokenbrowser}    R Documentation

Wrap tokens into document HTML strings

Description

Pastes the tokens of each document together and returns each document as an <article> HTML element.

Usage

wrap_documents(
  tokens,
  meta,
  doc_col = "doc_id",
  token_col = "token",
  space_col = NULL,
  nav = doc_col,
  token_nav = NULL,
  top_nav = NULL,
  thres_nav = NULL
)

Arguments

tokens

A data.frame with a column for document ids (doc_col) and a column for tokens (token_col)

meta

A data.frame with a column for document ids (doc_col). All other columns are added to the browser as document metadata

doc_col

The name of the document id column

token_col

The name of the token column

space_col

Optionally, the name of a column that gives the whitespace (e.g., a newline) for each token, which is how some NLP parsers indicate spaces

nav

The column in meta used for navigation. Defaults to the value of doc_col ('doc_id' unless changed)

token_nav

An alternative to nav (which uses meta): a column in tokens used for navigation

top_nav

If token_nav is used, navigation filters will only apply to the top x values with the highest token occurrence in a document

thres_nav

Like top_nav, but specifying a threshold for the minimum number of tokens.
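
As a rough illustration of the input shapes described above, the following sketch builds a minimal tokens and meta pair by hand. The data, the document ids and the author column are all made up for this example; the call to wrap_documents is guarded so that it only runs if the package is installed, and it relies on the default doc_col and token_col names.

```r
# Toy input: one row per token, each tagged with its document id.
tokens <- data.frame(
  doc_id = c("d1", "d1", "d1", "d2", "d2"),
  token  = c("Hello", "world", "!", "Second", "document"),
  stringsAsFactors = FALSE
)

# One row per document. Columns other than doc_id are treated as
# document metadata ("author" is a hypothetical column).
meta <- data.frame(
  doc_id = c("d1", "d2"),
  author = c("A", "B"),
  stringsAsFactors = FALSE
)

# Sanity check for this sketch: every document in tokens has a row in meta.
stopifnot(all(unique(tokens$doc_id) %in% meta$doc_id))

# Only call the package function if tokenbrowser is available.
if (requireNamespace("tokenbrowser", quietly = TRUE)) {
  docs <- tokenbrowser::wrap_documents(tokens, meta)
  print(names(docs))
}
```

If the id or token columns have other names, they would be passed via doc_col and token_col.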

Value

A named vector, with document ids as names and the document html strings as values

Examples

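library(tokenbrowser)
# Wrap the tokens of each document in sotu_data into an HTML string
docs <- wrap_documents(sotu_data$tokens, sotu_data$meta)
head(names(docs))  # document ids
docs[[1]]          # HTML string of the first document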
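
Because the result is a named vector, a single document can be looked up by its id and saved with base R alone. The sketch below uses a stand-in vector with made-up ids and simplified markup (the HTML actually produced by wrap_documents will look different), so it runs without the package.

```r
# Stand-in for the result of wrap_documents(): names are document ids,
# values are HTML strings. Ids and markup here are invented for illustration.
docs <- c(
  doc1 = "<article>First document</article>",
  doc2 = "<article>Second document</article>"
)

# Look up one document by its id.
html <- docs[["doc2"]]

# Write that document to a temporary .html file.
path <- tempfile(fileext = ".html")
writeLines(html, path)
```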

[Package tokenbrowser version 0.1.5 Index]