collapse_tokens {audubon} | R Documentation |
Collapse sequences of tokens by condition
Description
Concatenates sequences of tokens in the tidy text dataset, while grouping them by an expression.
Usage
collapse_tokens(tbl, condition, .collapse = "")
Arguments
tbl |
A tidy text dataset. |
condition |
An expression evaluated on the columns of tbl; sequences of tokens are grouped by its result. |
.collapse |
String with which tokens are concatenated. |
Details
Note that this function drops all columns except 'token' and the columns used for grouping sequences, so the returned data.frame has only the 'doc_id', 'sentence_id', 'token_id', and 'token' columns.
Value
A data.frame.
Examples
df <- prettify(head(hiroba), col_select = "POS1")
collapse_tokens(df, POS1 == "\u540d\u8a5e")
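
The idea in the Description can also be illustrated without the package. The following base R sketch assumes that each run of consecutive tokens satisfying the condition, within the same document and sentence, is merged into one token. The helper collapse_runs and the toy data are hypothetical and are not part of audubon; the real function may differ in details such as how 'token_id' is renumbered.

```r
# Toy tidy text table; column names follow those listed in Details.
# The `pos` column and its values are made up for illustration.
toy <- data.frame(
  doc_id      = c(1L, 1L, 1L, 1L, 1L),
  sentence_id = c(1L, 1L, 1L, 1L, 1L),
  token_id    = 1:5,
  token       = c("New", "York", "is", "big", "city"),
  pos         = c("noun", "noun", "verb", "adj", "noun"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not audubon's implementation): merge each run of
# consecutive rows where `flag` is TRUE into a single token.
# Assumes `tbl` has at least one row.
collapse_runs <- function(tbl, flag, collapse = "") {
  n <- nrow(tbl)
  prev_flag <- c(FALSE, flag[-n])
  # TRUE where a row shares its document and sentence with the previous row
  same_sentence <- c(
    FALSE,
    tbl$doc_id[-1] == tbl$doc_id[-n] & tbl$sentence_id[-1] == tbl$sentence_id[-n]
  )
  # A row continues the previous group only if both rows are flagged
  starts_group <- !(flag & prev_flag & same_sentence)
  group <- cumsum(starts_group)
  first <- !duplicated(group)
  out <- data.frame(
    doc_id      = tbl$doc_id[first],
    sentence_id = tbl$sentence_id[first],
    token       = vapply(split(tbl$token, group), paste, character(1),
                         collapse = collapse, USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
  # Renumber tokens within each document and sentence
  out$token_id <- ave(seq_len(nrow(out)), out$doc_id, out$sentence_id,
                      FUN = seq_along)
  # Keep only the grouping columns and 'token', as noted in Details
  out[, c("doc_id", "sentence_id", "token_id", "token")]
}

result <- collapse_runs(toy, toy$pos == "noun", collapse = " ")
print(result)
```

Here the two adjacent nouns are joined with a space, while the final noun stays separate because it is not adjacent to another flagged token. The `pos` column is dropped from the result, mirroring the note in Details.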
[Package audubon version 0.5.2 Index]