add_multitoken_label | Choose and add multitoken strings based on multitoken categories |
aggregate_rsyntax | Aggregate rsyntax annotations |
agg_label | Helper function for aggregate_rsyntax |
agg_tcorpus | Aggregate the tokens data |
annotate_rsyntax | Annotate tokens based on rsyntax queries |
as.tcorpus | Force an object to be a tCorpus class |
as.tcorpus.default | Force an object to be a tCorpus class |
as.tcorpus.tCorpus | Force an object to be a tCorpus class |
backbone_filter | Extract the backbone of a network |
browse_hits | View hits in a browser |
browse_texts | Create and view a full text browser |
calc_chi2 | Vectorized computation of chi^2 statistic for a 2x2 crosstab containing the values [a, b] [c, d] |
code_dictionary | Dictionary lookup |
code_features | Code features in a tCorpus based on a search string |
compare_corpus | Compare tCorpus vocabulary to that of another (reference) tCorpus |
compare_documents | Calculate the similarity of documents |
compare_subset | Compare vocabulary of a subset of a tCorpus to the rest of the tCorpus |
context | Get a context vector |
corenlp_tokens | coreNLP example sentences |
count_tcorpus | Count results of search hits, or of a given feature in tokens |
create_tcorpus | Create a tCorpus |
create_tcorpus.character | Create a tCorpus |
create_tcorpus.corpus | Create a tCorpus |
create_tcorpus.data.frame | Create a tCorpus |
create_tcorpus.factor | Create a tCorpus |
deduplicate | Deduplicate documents |
delete_columns | Delete column from the data and meta data |
delete_meta_columns | Delete column from the data and meta data |
docfreq_filter | Support function for subset method |
dtm_compare | Compare two document term matrices |
dtm_wordcloud | Plot a word cloud from a dtm |
ego_semnet | Create an ego network |
export_span_annotations | Export span annotations |
feats_to_columns | Cast the "feats" column in UDpipe tokens to columns |
feature_associations | Get common nearby features given a query or query hits |
feature_stats | Feature statistics |
feature_subset | Filter features |
fold_rsyntax | Fold rsyntax annotations |
freq_filter | Support function for subset method |
get | Access the data from a tCorpus |
get_dfm | Create a document term matrix |
get_dtm | Create a document term matrix |
get_global_i | Compute global feature positions |
get_kwic | Get keyword-in-context (KWIC) strings |
get_meta | Access the data from a tCorpus |
get_stopwords | Get a character vector of stopwords |
laplace | Laplace (i.e. add constant) smoothing |
lda_fit | Estimate an LDA topic model |
melt_quanteda_dict | Convert a quanteda dictionary to a long data.table format |
merge | Merge the token and meta data.tables of a tCorpus with another data.frame |
merge_meta | Merge the token and meta data.tables of a tCorpus with another data.frame |
merge_tcorpora | Merge tCorpus objects |
plot.contextHits | S3 plot for contextHits class |
plot.featureAssociations | Visualize feature associations |
plot.featureHits | S3 plot for featureHits class |
plot.vocabularyComparison | Visualize vocabularyComparison |
plot_semnet | Visualize a semnet network |
plot_words | Plot a wordcloud with words ordered and coloured according to a dimension (x) |
preprocess | Preprocess feature |
preprocess_tokens | Preprocess tokens in a character vector |
print.contextHits | S3 print for contextHits class |
print.featureHits | S3 print for featureHits class |
print.tCorpus | S3 print for tCorpus class |
refresh_tcorpus | Refresh a tCorpus object using the current version of corpustools |
replace_dictionary | Replace tokens with dictionary match |
require_package | Check if package with given version exists |
search_contexts | Search for documents or sentences using Boolean queries |
search_dictionary | Dictionary lookup |
search_features | Find tokens using a Lucene-like search query |
search_recode | Recode features in a tCorpus based on a search string |
semnet | Create a semantic network based on the co-occurrence of tokens in documents |
semnet_window | Create a semantic network based on the co-occurrence of tokens in token windows |
set | Modify the token and meta data.tables of a tCorpus |
set_levels | Change levels of factor columns |
set_meta | Modify the token and meta data.tables of a tCorpus |
set_meta_levels | Change levels of factor columns |
set_meta_name | Change column names of data and meta data |
set_name | Change column names of data and meta data |
set_network_attributes | Set some default network attributes for pretty plotting |
sgt | Simple Good Turing smoothing |
show_udpipe_models | Show the names of udpipe models |
sotu_texts | State of the Union addresses |
stopwords_list | Basic stopword lists |
subset | Subset a tCorpus |
subset.tCorpus | S3 subset for tCorpus class |
subset_meta | Subset a tCorpus |
subset_query | Subset tCorpus token data using a query |
summary.contextHits | S3 summary for contextHits class |
summary.featureHits | S3 summary for featureHits class |
summary.tCorpus | Summary of a tCorpus object |
tCorpus | tCorpus: a corpus class for tokenized texts |
tcorpus | tCorpus: a corpus class for tokenized texts |
tCorpus$annotate_rsyntax | Annotate tokens based on rsyntax queries |
tCorpus$code_dictionary | Dictionary lookup |
tCorpus$code_features | Code features in a tCorpus based on a search string |
tCorpus$context | Get a context vector |
tCorpus$deduplicate | Deduplicate documents |
tCorpus$delete_columns | Delete column from the data and meta data |
tCorpus$delete_meta_columns | Delete column from the data and meta data |
tCorpus$feats_to_columns | Cast the "feats" column in UDpipe tokens to columns |
tCorpus$feature_subset | Filter features |
tCorpus$fold_rsyntax | Fold rsyntax annotations |
tCorpus$get | Access the data from a tCorpus |
tCorpus$get_meta | Access the data from a tCorpus |
tCorpus$lda_fit | Estimate an LDA topic model |
tCorpus$merge | Merge the token and meta data.tables of a tCorpus with another data.frame |
tCorpus$preprocess | Preprocess feature |
tCorpus$replace_dictionary | Replace tokens with dictionary match |
tCorpus$search_recode | Recode features in a tCorpus based on a search string |
tCorpus$set | Modify the token and meta data.tables of a tCorpus |
tCorpus$set_levels | Change levels of factor columns |
tCorpus$set_meta | Modify the token and meta data.tables of a tCorpus |
tCorpus$set_meta_levels | Change levels of factor columns |
tCorpus$set_meta_name | Change column names of data and meta data |
tCorpus$set_name | Change column names of data and meta data |
tCorpus$subset | Subset a tCorpus |
tCorpus$subset_meta | Subset a tCorpus |
tCorpus$subset_query | Subset tCorpus token data using a query |
tCorpus$udpipe_clauses | Add columns indicating who did what |
tCorpus$udpipe_quotes | Add columns indicating who said what |
tCorpus_compare | Corpus comparison |
tCorpus_create | Creating a tCorpus |
tCorpus_data | Methods and functions for viewing, modifying and subsetting tCorpus data |
tCorpus_docsim | Document similarity |
tCorpus_features | Preprocessing, subsetting and analyzing features |
tCorpus_modify_by_reference | Modify tCorpus by reference |
tCorpus_querying | Use Boolean queries to analyze the tCorpus |
tCorpus_semnet | Feature co-occurrence based semantic network analysis |
tCorpus_topmod | Topic modeling |
tc_plot_tree | Visualize a dependency tree |
tc_sotu_udpipe | A tCorpus with a small sample of sotu paragraphs parsed with udpipe |
tokens_to_tcorpus | Create a tCorpus based on tokens (i.e. preprocessed texts) |
tokenWindowOccurence | Gives the window in which a term occurred in a matrix |
top_features | Show top features |
transform_rsyntax | Apply rsyntax transformations |
udpipe_clauses | Add columns indicating who did what |
udpipe_clause_tqueries | Get a list of tqueries for extracting who did what |
udpipe_quotes | Add columns indicating who said what |
udpipe_quote_tqueries | Get a list of tqueries for extracting quotes |
udpipe_simplify | Simplify tokenIndex created with the udpipe parser |
udpipe_spanquote_tqueries | Get a list of tqueries for finding candidates for span quotes |
udpipe_tcorpus | Create a tCorpus using udpipe |
udpipe_tcorpus.character | Create a tCorpus using udpipe |
udpipe_tcorpus.corpus | Create a tCorpus using udpipe |
udpipe_tcorpus.data.frame | Create a tCorpus using udpipe |
udpipe_tcorpus.factor | Create a tCorpus using udpipe |
untokenize | Reconstruct original texts |