u_char_names {Unicode} | R Documentation |
Unicode Character Names
Description
Find the names or labels of Unicode characters, or Unicode characters by their name.
Usage
u_char_name(x)
u_char_from_name(x, type = c("exact", "grep"), ...)
u_char_label(x)
Arguments
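x |
an R object which can be coerced to a u_char object (for u_char_name and u_char_label), or a character vector of names or patterns (for u_char_from_name). |
type |
one of "exact" or "grep"; abbreviations such as "g" are accepted. |
... |
arguments to be passed to grepl when type = "grep". |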
Details
The Unicode Standard provides a convention for labeling code points
that do not have character names (control, reserved, noncharacter,
private-use and surrogate code points). These labels can be obtained
by u_char_label.
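As a rough illustration of this labeling convention, the Unicode Standard forms a label from the code point type and the code point in hexadecimal, written in angle brackets in running text. The sketch below uses only base R; make_label is a hypothetical helper introduced here, not a function of this package, and it makes no claim about how u_char_label is implemented.

```r
# Hypothetical helper (not part of the Unicode package): build a code point
# label of the form <type-NNNN>, following the Unicode Standard convention.
make_label <- function(cp, type) {
  # Code points are written with at least four uppercase hex digits.
  sprintf("<%s-%04X>", type, as.integer(cp))
}

labels <- c(
  make_label(0x000A, "control"),       # U+000A, a control code point
  make_label(0xE000, "private-use"),   # start of the Private Use Area
  make_label(0xD800, "surrogate"),     # first high surrogate
  make_label(0xFFFF, "noncharacter")   # a noncharacter in the BMP
)
print(labels)
```

The type of each code point is passed in by hand here; determining it from the code point itself requires the Unicode character data.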
By default, exact matching is used for finding Unicode characters by
name. When type = "grep", grepl is used for matching x against the
Unicode character names; for now, Hangul syllable and CJK Unified
Ideograph names are ignored in this case.
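To make the difference between the two matching modes concrete, here is a minimal base R sketch over a small hand-picked vector of character names. The vector nms is illustrative only and is not the package's data. The code mimics the matching semantics described above and assumes nothing about the internals of u_char_from_name.

```r
# A handful of real Unicode character names, chosen for illustration only.
nms <- c("DIGIT ONE", "DIGIT TWO", "ARABIC-INDIC DIGIT ONE",
         "SUPERSCRIPT ONE", "FULLWIDTH DIGIT ONE")

# Exact matching: the whole name must equal the query.
exact <- nms[nms %in% "DIGIT ONE"]

# Regular expression matching via grepl(); \\b marks word boundaries, so the
# pattern also hits longer names that contain the words "DIGIT ONE".
pattern <- nms[grepl("\\bDIGIT ONE\\b", nms)]

# Extra grepl() arguments change the matching, e.g. ignore.case = TRUE.
loose <- nms[grepl("digit one", nms, ignore.case = TRUE)]

print(exact)
print(pattern)
print(loose)
```

Exact matching selects a single name, while the pattern also selects the Arabic-Indic and fullwidth digits. Options such as ignore.case would presumably be supplied to u_char_from_name through its ... argument.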
Value
For u_char_name and u_char_label, a character vector with the names or
labels, respectively, of the corresponding Unicode characters.
For u_char_from_name, a u_char object giving the Unicode characters
whose names match the given names: exactly by default, or as regular
expressions when type = "grep".
Examples
x <- as.u_char(utf8ToInt("Austria"))
u_char_name(x)
## Derived Hangul syllable character names are also supported for
## finding characters by exact matching:
x <- u_char_name("0xAC00")
x
u_char_from_name(x)
## Find all Unicode characters with name matching 'DIGIT ONE'.
x <- u_char_from_name("\\bDIGIT ONE\\b", "g")
## And show their names.
u_char_name(x)