stri_enc_set {stringi}    R Documentation
Set or Get Default Character Encoding in stringi
Description
stri_enc_set sets the encoding used to re-encode strings that are internally (i.e., by R) declared to be in the native encoding; see stringi-encoding and stri_enc_mark.
stri_enc_get returns the currently used default encoding.
Usage
stri_enc_set(enc)
stri_enc_get()
Arguments
enc |
single string; character encoding name,
see |
Details
stri_enc_get is the same as stri_enc_info(NULL)$Name.friendly.
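The equivalence stated above can be checked directly. This is a minimal sketch that assumes the stringi package is installed; the printed encoding name depends on your system, so no particular value is shown.

```r
library(stringi)

# Current default encoding as reported by stringi
current <- stri_enc_get()
print(current)

# The same name, read from the encoding information list
friendly <- stri_enc_info(NULL)$Name.friendly

# Both calls should agree
identical(current, friendly)
```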
Note that changing the default encoding may have undesired consequences. Unless you are an expert user and know what you are doing, stri_enc_set should only be used if ICU fails to detect your system's encoding correctly (while testing stringi, we only encountered such a situation on a very old Solaris machine). Note that ICU tries to match the encoding part of the LC_CTYPE category as given by Sys.getlocale.
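To judge whether ICU detected the system encoding correctly, it may help to compare the LC_CTYPE setting with what stringi reports. The sketch below is illustrative only: the variable names are hypothetical, and the way the encoding part is extracted (everything after the first dot) is a simplifying assumption that will not suit every locale string, for example "C", which has no encoding part.

```r
library(stringi)

# Locale string for the LC_CTYPE category, e.g. "en_US.UTF-8"
ctype <- Sys.getlocale("LC_CTYPE")

# Assumption: the encoding part, if present, follows the first dot
enc_part <- if (grepl(".", ctype, fixed = TRUE)) {
    sub("^[^.]*\\.", "", ctype)
} else {
    NA_character_
}

cat("LC_CTYPE:           ", ctype, "\n")
cat("Encoding part:      ", enc_part, "\n")
cat("stringi default enc:", stri_enc_get(), "\n")
```

If the two encodings name the same character set, detection worked and there is normally no reason to call stri_enc_set.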
If you set a default encoding that is neither a superset of ASCII nor an 8-bit encoding, a warning will be generated; see stringi-encoding for discussion.
stri_enc_set has no effect if the system ICU assumes that the default charset is always UTF-8 (i.e., where the internal U_CHARSET_IS_UTF8 is defined and set to 1); see stri_info.
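Whether this restriction applies can be inspected before attempting a change. The sketch below assumes that the list returned by stri_info() contains a logical element named ICU.UTF8 reflecting the U_CHARSET_IS_UTF8 setting; consult the stri_info help page to confirm the element name for your stringi version. isTRUE() is used so that a missing element is treated as FALSE rather than raising an error.

```r
library(stringi)

info <- stri_info()

# Assumption: info$ICU.UTF8 reports whether ICU always assumes UTF-8
utf8_fixed <- isTRUE(info$ICU.UTF8)

if (utf8_fixed) {
    message("ICU always assumes UTF-8; stri_enc_set() will have no effect.")
} else {
    message("The default encoding can be changed with stri_enc_set().")
}
```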
Value
stri_enc_set invisibly returns a string with the previously used character encoding.
stri_enc_get returns a string with the current default character encoding.
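Because stri_enc_set returns the previous encoding, that value can be kept and used to restore the original setting afterwards. The following is a minimal sketch of that pattern, not a recommendation to change the encoding; "UTF-8" is chosen only as an example, and the variable name old is hypothetical.

```r
library(stringi)

# Switch the default encoding and remember the previous one
old <- stri_enc_set("UTF-8")

tryCatch({
    # Work that relies on the temporary default encoding goes here
    print(stri_enc_get())
}, finally = {
    # Restore the previous default, even if an error occurred
    stri_enc_set(old)
})
```

Wrapping the work in tryCatch with a finally clause keeps the change temporary even when the code in between fails.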
Author(s)
Marek Gagolewski and other contributors
See Also
The official online manual of stringi at https://stringi.gagolewski.com/
Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, doi:10.18637/jss.v103.i02
Other encoding_management: about_encoding, stri_enc_info(), stri_enc_list(), stri_enc_mark()