tCorpus$code_features {corpustools}    R Documentation

Code features in a tCorpus based on a search string

Description

Like search_features, but instead of returning the hits, it adds a column to the token data that contains a code (the query label) for the tokens that match the query. Only one code can be assigned to each token, so if the results of different queries overlap, the code of the last query is used. The order of the queries (in the query argument) therefore matters.

Usage

## R6 method for class tCorpus. Use as tc$method (where tc is a tCorpus object).

code_features(query, code=NULL, feature='token', column='code', ...)

Arguments

query

A character string that is a query. See search_features for documentation of the query language.

code

The code given to the tokens that match the query (useful when looking for multiple queries). The code label can also be placed in the query itself, before a #, as in "actors# anna bob".

feature

The name of the feature column within which to search.

column

The name of the column that is added to the data

add_column

A list of name-value pairs, used to add additional columns. Each name becomes a column name, and each value should be a vector of the same length as the query vector.

context_level

Select whether the queries should occur within whole "documents" or specific "sentences".

as_ascii

If TRUE, perform the search in ASCII.

verbose

If TRUE, progress messages will be printed

overwrite

If TRUE (default) and column already exists, overwrite previous results.

...

Alternative way to specify name-value pairs for adding additional columns.

Examples

tc = create_tcorpus('Anna and Bob are secretive')

tc$code_features(c("actors# anna bob", "associations# secretive"))
tc$tokens
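
The order dependence noted in the Description can be illustrated with two overlapping queries. This is a minimal sketch that reuses only the calls shown above. The labels "people" and "women" are made up for illustration, and it assumes the default column name 'code' from the Usage section. Per the Description, the token "Anna" matches both queries and should receive the code of whichever query comes last.

```r
library(corpustools)

## Two overlapping queries: "anna" matches both "people" and "women"
## (both labels are hypothetical, chosen only for this illustration).
queries = c("people# anna bob", "women# anna")

## Order 1: "women" is the last query
tc = create_tcorpus('Anna and Bob are secretive')
tc$code_features(queries)
tc$tokens

## Order 2: same queries reversed, so "people" is the last query
tc_rev = create_tcorpus('Anna and Bob are secretive')
tc_rev$code_features(rev(queries))
tc_rev$tokens
```

Comparing the code column of the two token tables shows how the order of the queries affects the token matched by both.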

[Package corpustools version 0.5.1 Index]