space_cjk {piecemaker} R Documentation

Add Spaces Around CJK Ideographs

Description

To tokenize text containing Chinese, Japanese, and Korean (CJK) ideographs, it is convenient to add spaces around each ideograph so that a whitespace-based tokenizer treats it as a separate token.

Usage

space_cjk(text)

Arguments

text

A character vector to clean.

Value

A character vector of the same length as the input text, with spaces added around each ideograph.

Examples

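# Nine consecutive ideographs, U+3400 to U+3408, as a single string.
to_space <- intToUtf8(13312:13320)
to_space
# Each ideograph is now surrounded by spaces.
space_cjk(to_space)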
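
The following is a minimal base-R sketch of why spacing helps tokenization; it is not the piecemaker implementation. The helper name space_han_sketch is hypothetical, and it assumes that matching the Han script with a PCRE property class is a close enough stand-in for "ideograph", which may not agree with space_cjk() for every character.

```r
# Hypothetical approximation for illustration; not the package's own code.
space_han_sketch <- function(text) {
  # Assumption: treat every Han-script character as an ideograph and
  # wrap it in spaces, using a PCRE Unicode script property.
  gsub("(\\p{Han})", " \\1 ", text, perl = TRUE)
}

x <- intToUtf8(c(13312, 13313))  # two consecutive ideographs in one string
spaced <- space_han_sketch(x)

# Splitting on runs of whitespace now yields one token per ideograph.
tokens <- strsplit(trimws(spaced), "\\s+")[[1]]
length(tokens)
```

Without the added spaces, the same whitespace split would return the whole run of ideographs as a single token.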

[Package piecemaker version 1.0.2 Index]