space_cjk {piecemaker} | R Documentation
Add Spaces Around CJK Ideographs
Description
To tokenize Chinese, Japanese, and Korean (CJK) text, it is convenient to add spaces around each ideograph so that every character can be treated as a separate token.
Usage
space_cjk(text)
Arguments
text
A character vector to clean.
Value
A character vector of the same length as the input text, with spaces added around each ideograph.
Examples
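# Build a string of nine CJK ideographs (code points U+3400 to U+3408).
to_space <- intToUtf8(13312:13320)
to_space
# Add spaces around each ideograph.
space_cjk(to_space)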
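The effect described above can be approximated in base R. The following is only a sketch, not the package's implementation: `space_han_sketch` is a hypothetical helper, and it assumes that matching the Unicode Han script with `\p{Han}` (which requires an R build whose PCRE has Unicode property support) is a reasonable stand-in for "ideograph". It may differ from `space_cjk()` on edge cases.

```r
# Hypothetical helper, not part of piecemaker: pad each Han character with spaces.
space_han_sketch <- function(text) {
  # \p{Han} matches characters in the Han script; perl = TRUE selects PCRE.
  gsub("(\\p{Han})", " \\1 ", text, perl = TRUE)
}

# Two ideographs (U+3400, U+3401) and a plain ASCII string.
input <- c(intToUtf8(13312:13313), "abc")
spaced <- space_han_sketch(input)
spaced

# Splitting on runs of whitespace then yields one token per ideograph.
tokens <- strsplit(trimws(spaced), "\\s+")
tokens
```

Because the substitution is vectorized, the result has one element per input element, and strings without Han characters pass through unchanged.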
[Package piecemaker version 1.0.2 Index]