subset.tCorpus {corpustools} | R Documentation |
S3 subset for tCorpus class
Description
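S3 subset method for the tCorpus class. Rows can be selected in the tokens data (subset), in the document meta data (subset_meta), or both.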
Usage
## S3 method for class 'tCorpus'
subset(x, subset = NULL, subset_meta = NULL, window = NULL, ...)
Arguments
x |
a tCorpus object |
subset |
logical expression indicating rows to keep in the tokens data. |
subset_meta |
logical expression indicating rows to keep in the document meta data. |
window |
If not NULL, an integer specifying the window to be used to return the subset. For instance, if the subset contains token 10 in a document and window is 5, the subset will contain tokens 5 to 15. This does not apply to subset_meta. |
... |
not used |
Examples
## create tcorpus of Bush and Obama docs (10 documents)
tc = create_tcorpus(sotu_texts[c(1:5,801:805),], doc_col='id')
## subset to keep only tokens where token_id <= 20 (i.e. first 20 tokens)
tcs1 = subset(tc, token_id <= 20)
tcs1
## subset to keep only documents where president is Barack Obama
tcs2 = subset(tc, subset_meta = president == 'Barack Obama')
tcs2
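## additional sketch (not part of the original examples), reusing tc from above:
## keep token 10 of each document plus a window of 5 tokens on either side;
## per the description of the window argument this should give tokens 5 to 15
tcs3 = subset(tc, token_id == 10, window = 5)
tcs3
## sketch, assuming subset and subset_meta can be combined in a single call:
## keep the first 20 tokens of the Barack Obama documents only
tcs4 = subset(tc, token_id <= 20, subset_meta = president == 'Barack Obama')
tcs4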
[Package corpustools version 0.5.1 Index]