bind_lr | Bind importance of bigrams |
bind_tf_idf2 | Bind term frequency and inverse document frequency |
collapse_tokens | Collapse sequences of tokens by condition |
get_dict_features | Get dictionary features |
hiroba | Tokens of the whole text of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko |
lex_density | Calculate lexical density |
mute_tokens | Mute tokens by condition |
ngram_tokenizer | N-gram tokenizer |
pack | Pack a data.frame of tokens |
polano | Whole text of 'Porano no Hiroba' written by Miyazawa Kenji from Aozora Bunko |
prettify | Prettify tokenized output |
read_rewrite_def | Read a rewrite.def file |
strj_fill_iter_mark | Fill Japanese iteration marks |
strj_hiraganize | Hiraganize Japanese characters |
strj_katakanize | Katakanize Japanese characters |
strj_normalize | Convert text following the rules of 'NEologd' |
strj_rewrite_as_def | Rewrite text using rewrite.def |
strj_romanize | Romanize Japanese Hiragana and Katakana |
strj_segment | Segment text into tokens |
strj_tinyseg | Segment text into phrases |
strj_tokenize | Split text into tokens |
strj_transcribe_num | Transcribe Arabic numerals to Kansuji |