AsciiToInt {sfsmisc} | R Documentation |
Character to and from Integer Codes Conversion
Description
AsciiToInt returns integer codes in 0:255 for each (one byte) character in strings. ichar is an alias for it, for old S compatibility.

strcodes implements in R the basic engine for translating characters to corresponding integer codes.

chars8bit() is the inverse function of AsciiToInt, producing “one byte” characters from integer codes. Note that it (and hence strcodes()) depends on the locale, see Sys.getlocale().
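As a rough illustration of the character-to-code mapping described above, the following sketch uses only base R (charToRaw() and rawToChar()) rather than the sfsmisc functions. The helper names to_codes and from_codes are hypothetical, and nothing here is claimed to match how AsciiToInt() or chars8bit() are implemented.

```r
# Hypothetical helpers built on base R only; not the sfsmisc implementation.

# Byte codes of a single string:
to_codes <- function(s) as.integer(charToRaw(s))

# One-byte characters from integer codes in 1:255:
from_codes <- function(i) rawToChar(as.raw(i), multiple = TRUE)

to_codes("ABC")     # 65 66 67
from_codes(65:70)   # "A" "B" "C" "D" "E" "F"
```

For pure ASCII input the two helpers are inverses of each other, which mirrors the relation between AsciiToInt() and chars8bit() stated above.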
Usage
AsciiToInt(strings)
ichar(strings)
chars8bit(i = 1:255)
strcodes(x, table = chars8bit(1:255))
Arguments
strings, x | character vector. |
i | numeric (integer) vector of values in 1:255. |
table | a vector of (unique) character strings, typically of one character each. |
Details
Only codes in 1:127 make up the ASCII encoding, which should be identical for all R versions, whereas the ‘upper’ half is often determined from the ISO-8859-1 (aka “ISO-Latin 1”) encoding, but may well differ, depending on the locale setting, see also Sys.setlocale.

Note that 0 is no longer allowed, since R does not allow \0 (aka nul) characters in a string anymore.
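The two points above can be checked with base R alone. The sketch below is an illustration under stated assumptions, not part of sfsmisc: the variable names are hypothetical, and the byte 252 is explicitly declared as ISO-8859-1 so that the result does not depend on the current locale.

```r
# Base R sketch; 'u_uml', 'utf8_bytes' and 'nul_ok' are hypothetical names.

u_uml <- rawToChar(as.raw(252))   # the single byte 0xFC
Encoding(u_uml) <- "latin1"       # declare it as ISO-8859-1 (u with diaeresis)

# The same character needs two bytes once converted to UTF-8:
utf8_bytes <- as.integer(charToRaw(enc2utf8(u_uml)))
utf8_bytes                        # 195 188
utf8ToInt(enc2utf8(u_uml))        # 252, the Unicode code point

# An embedded nul byte is rejected when building a string:
nul_ok <- tryCatch({ rawToChar(as.raw(c(65, 0, 66))); TRUE },
                   error = function(e) FALSE)
nul_ok                            # FALSE
```

This is why a code in the upper half, such as 252, corresponds to a single byte only under a one-byte encoding like ISO-8859-1.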
Value
AsciiToInt (and hence ichar) and chars8bit return a vector of the same length as their argument.

strcodes(x, tab) returns a list of the same length and names as x, with list components of integer vectors with codes in 1:255.
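The shape of the list described above can be reproduced with a small table lookup in base R. This is a minimal sketch under assumptions: codes_sketch and ascii_table are hypothetical names, the table covers only the ASCII codes 1:127, and the approach (strsplit() followed by match()) is not claimed to be the package's actual code.

```r
# Hypothetical stand-in for a strcodes()-like lookup; base R only.
# 'ascii_table' covers codes 1:127, so any other character maps to NA.
ascii_table <- rawToChar(as.raw(1:127), multiple = TRUE)

codes_sketch <- function(x, table = ascii_table) {
  # split each string into characters, then look each one up in 'table'
  lapply(strsplit(x, ""), match, table = table)
}

res <- codes_sketch(c(a = "ABC", ch = "1234", place = "Z\u00fcrich"))
res$a       # 65 66 67
res$ch      # 49 50 51 52
names(res)  # "a" "ch" "place"
```

Because the table in this sketch stops at 127, the non-ASCII character in the third string yields NA instead of a code.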
Author(s)
Martin Maechler, partly in 1991 for S-plus
Examples
chars8bit(65:70) #-> "A" "B" .. "F"
stopifnot(identical(LETTERS, chars8bit(65:90)),
          identical(AsciiToInt(LETTERS), 65:90))
## may only work in ISO-latin1 locale (not in UTF-8):
try( strcodes(c(a= "ABC", ch="1234", place = "Zürich")) )
## in "latin-1" gives {otherwise should give NA instead of 252}:
## Not run:
## $a
## [1] 65 66 67
## $ch
## [1] 49 50 51 52
## $place
## [1]  90 252 114 105  99 104
## End(Not run)
myloc <- Sys.getlocale()
if(.Platform$OS.type == "unix") withAutoprint({ # ''should work'' here
  try( Sys.setlocale(locale = "de_CH") ) # "try": just in case
  strcodes(c(a= "ABC", ch="1234", place = "Zürich")) # no NA hopefully
  AsciiToInt(chars8bit()) # -> 1:255 {if setting latin1 succeeded above}
  chars8bit(97:140)
  try( Sys.setlocale(locale = "de_CH.utf-8") ) # "try": just in case
  chars8bit(97:140) ## typically looks different than above
})
## Resetting to original locale .. works "mostly":
lapply(strsplit(strsplit(myloc, ";")[[1]], "="),
       function(cc) try(Sys.setlocale(cc[1], cc[2]))) -> .scratch
Sys.getlocale() == myloc # TRUE if we have succeeded to reset it