tCorpus_data {corpustools} R Documentation

## Methods and functions for viewing, modifying and subsetting tCorpus data

### Details

Get data

| Method | Description |
|---|---|
| $get() | Get the token data (by default a deep copy), optionally selecting columns and subsetting rows. To avoid copying, you can access the token data directly with tc$tokens |
| $get_meta() | Get the meta data, optionally selecting columns and subsetting rows. As with tokens, you can also access the meta data directly with tc$meta |
| get_dtm() | Create a document term matrix |
| get_dfm() | Create a document term matrix, using the quanteda dfm format |
| $context() | Get a context vector. Currently supports documents or globally unique sentences. |

Modify

The token and meta data can be modified with the set* and delete* methods. All modifications are performed by reference.

| Method | Description |
|---|---|
| $set() | Modify the token data by setting the values of one (existing or new) column. |
| $set_meta() | The set method for the document meta data |
| $set_levels() | Change the levels of factor columns. |
| $set_meta_levels() | Change the levels of factor columns in the meta data |
| $set_name() | Modify column names of token data. |
| $set_meta_name() | Modify column names of the meta data |
| $delete_columns() | Delete columns. |
| $delete_meta_columns() | Delete columns in the meta data |

Modifying is restricted in certain ways to ensure that the data always meets the assumptions required for tCorpus methods. tCorpus automatically tests whether assumptions are violated, so you don't have to think about this yourself. The most important limitations are that you cannot subset or append the data. For subsetting, you can use the tCorpus$subset method, and to add data to a tCorpus you can use the merge_tcorpora function.
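
The get, set and delete methods described above can be sketched as follows. This is a minimal illustration, not a definitive usage pattern: it assumes the corpustools package is installed, the two example texts and the column names token_lower and lower are made up for the example, and the argument order (column, value, subset) reflects my understanding of the methods rather than anything stated on this page.

```r
library(corpustools)

# Two tiny hypothetical documents, used only for illustration
tc <- create_tcorpus(c("First example text.", "A second example text."),
                     doc_id = c("d1", "d2"))

# Get data: $get() returns a copy, so changing 'tokens' does not affect tc
tokens <- tc$get()
meta <- tc$get_meta()
head(tokens)

# Modify by reference: add a new column holding the lowercased token
tc$set("token_lower", tolower(token))

# Overwrite the value only for rows that match the subset expression
tc$set("token_lower", "TEXT", subset = token_lower == "text")

# Rename the new column, then delete it again
tc$set_name("token_lower", "lower")
tc$delete_columns("lower")
```

Because the set* and delete* methods work by reference, none of the calls above needs its result assigned back to tc.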

| Method | Description |
|---|---|
| subset() | Modify the token and/or meta data using the subset function. A subset expression can be specified for both the token data (subset) and the document meta data (subset_meta). |
| subset_query() | Subset the tCorpus based on a query, as used in search_contexts |
| $subset() | Like subset, but as an R6 method that changes the tCorpus by reference |
| $subset_query() | Like subset_query, but as an R6 method that changes the tCorpus by reference |

| Field | Description |
|---|---|
| $n | The number of tokens (i.e. rows in the data) |
| $n_meta | The number of documents (i.e. rows in the document meta data) |
| $names | The names of the token data columns |
| $names_meta | The names of the document meta data columns |
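
The difference between the subset() function and the $subset() R6 method can be sketched as follows. This is a hedged example that assumes the corpustools package is installed; the example texts and the object name tc_d1 are hypothetical, and the comments describe the behaviour I would expect from the descriptions above.

```r
library(corpustools)

# Two tiny hypothetical documents, used only for illustration
tc <- create_tcorpus(c("First example text.", "A second example text."),
                     doc_id = c("d1", "d2"))

# subset() returns a new tCorpus, so tc itself should keep both documents
tc_d1 <- subset(tc, subset_meta = doc_id == "d1")
tc$n_meta
tc_d1$n_meta

# $subset() changes tc by reference: keep the first three tokens per document
tc$subset(subset = token_id <= 3)

# The fields reflect the state of tc after the subset
tc$n
tc$names
```

Use the function form when the original corpus is still needed, and the R6 method when copying a large corpus would be wasteful.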