unhandled_ctl {fansi} | R Documentation |
Identify Unhandled Control Sequences
Description
Will return position and types of unhandled Control Sequences in a
character vector. Unhandled sequences may cause fansi
to interpret strings
in a way different to your display. See fansi for details. Functions that
interpret Special Sequences (CSI SGR or OSC hyperlinks) might omit bad
Special Sequences or some of their components in output substrings,
particularly if they are leading or trailing. Some functions are more
tolerant of bad inputs than others. For example nchar_ctl
will not
report unsupported colors because it only cares about counts or widths.
unhandled_ctl
will report all potentially problematic sequences.
Usage
unhandled_ctl(x, term.cap = getOption("fansi.term.cap", dflt_term_cap()))
Arguments
x |
character vector |
term.cap |
character a vector of the capabilities of the terminal, can
be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes
starting with "38;5" or "48;5"), "truecolor" (SGR codes starting with
"38;2" or "48;2"), and "all". "all" behaves as it does for the |
Details
To work around tabs present in input, you can use tabs_as_spaces
or the
tabs.as.spaces
parameter on functions that have it, or the strip_ctl
function to remove the troublesome sequences. Alternatively, you can use
warn=FALSE
to suppress the warnings.
This is a debugging function that is not optimized for speed and the precise
output of which might change with fansi
versions.
The return value is a data frame with five columns:
index: integer the index in
x
with the unhandled sequencestart: integer the start position of the sequence (in characters)
stop: integer the end of the sequence (in characters), but note that if there are multiple ESC sequences abutting each other they will all be treated as one, even if some of those sequences are valid.
error: the reason why the sequence was not handled:
unknown-substring: SGR substring with a value that does not correspond to a known SGR code or OSC hyperlink with unsupported parameters.
invalid-substr: SGR contains uncommon characters in ":<=>", intermediate bytes, other invalid characters, or there is an invalid subsequence (e.g. "ESC[38;2m" which should specify an RGB triplet but does not). OSCs contain invalid bytes, or OSC hyperlinks contain otherwise valid OSC bytes in 0x08-0x0d.
exceed-term-cap: contains color codes not supported by the terminal (see term_cap_test). Bright colors with color codes in the 90-97 and 100-107 range in terminals that do not support them are not considered errors, whereas 256 or truecolor codes in terminals that do not support them are. This is because the latter are often misinterpreted by terminals that do not support them, whereas the former are typically silently ignored.
CSI/OSC: a non-SGR CSI sequence, or non-hyperlink OSC sequence.
CSI/OSC-bad-substr: a CSI or OSC sequence containing invalid characters.
malformed-CSI/OSC: a malformed CSI or OSC sequence, typically one that never encounters its closing sequence before the end of a string.
non-CSI/OSC: a non-CSI or non-OSC escape sequence, i.e. one where the ESC is followed by something other than "[" or "]". Since we assume all non-CSI sequences are only 2 characters long include the ESC, this type of sequence is the most likely to cause problems as some are not actually two characters long.
malformed-ESC: a malformed two byte ESC sequence (i.e. one not ending in 0x40-0x7e).
C0: a "C0" control character (e.g. tab, bell, etc.).
malformed-UTF8: illegal UTF8 encoding.
non-ASCII: non-ASCII bytes in escape sequences.
translated: whether the string was translated to UTF-8, might be helpful in odd cases were character offsets change depending on encoding. You should only worry about this if you cannot tie out the
start
/stop
values to the escape sequence shown.esc: character the unhandled escape sequence
Value
Data frame with as many rows as there are unhandled escape sequences and columns containing useful information for debugging the problem. See details.
Note
Non-ASCII strings are converted to UTF-8 encoding.
See Also
?fansi
for details on how Control Sequences are
interpreted, particularly if you are getting unexpected results,
unhandled_ctl
for detecting bad control sequences.
Examples
string <- c(
"\033[41mhello world\033[m", "foo\033[22>m", "\033[999mbar",
"baz \033[31#3m", "a\033[31k", "hello\033m world"
)
unhandled_ctl(string)