latexify {dplR} | R Documentation
Convert Character Strings for Use with LaTeX
Description
Some characters cannot be entered directly into a LaTeX document.
This function converts the input character vector to a form suitable
for inclusion in a LaTeX document in text mode. It can be used
together with ‘\Sexpr’ in vignettes.
Usage
latexify(x, doublebackslash = TRUE, dashdash = TRUE,
quotes = c("straight", "curved"),
packages = c("fontenc", "textcomp"))
Arguments
x |
a |
doublebackslash |
a |
dashdash |
a |
quotes |
a |
packages |
a |
Details
The function is intended for use with unformatted inline text.
Newlines, tabs and other whitespace characters ("[:space:]" in regex)
are converted to spaces. Control characters ("[:cntrl:]") that are not
whitespace are removed. Other more or less special characters in the
ASCII set are ‘{’, ‘}’, ‘\’, ‘#’, ‘$’, ‘%’, ‘^’, ‘&’, ‘_’, ‘~’,
double quote, ‘/’, single quote, ‘<’, ‘>’, ‘|’, grave accent and ‘-’.
They are converted to the corresponding LaTeX commands. Some of the
conversions are affected by user options, e.g. dashdash.
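The whitespace and control character rules above can be illustrated with base R regular expressions. This is only a sketch of the described behaviour, not the code latexify itself uses; the variable names are made up for the illustration. Whitespace is handled first because "[:cntrl:]" also matches tabs and newlines.

```r
# Illustration only: mimics the documented rules with base R gsub().
x <- "tab\there\nbell\a end"

# Step 1: every whitespace character becomes a plain space
with_spaces <- gsub("[[:space:]]", " ", x)

# Step 2: control characters that are still present (here "\a") are removed
cleaned <- gsub("[[:cntrl:]]", "", with_spaces)

cleaned
```

Reversing the two steps would delete the tab and the newline instead of turning them into spaces.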
Before applying the substitutions described above, input elements with
Encoding set to "bytes" are printed and the output is stored using
captureOutput. The result of this intermediate stage is ASCII text
where some characters are shown as their byte codes using a
hexadecimal pair prefixed with "\x". This set includes tabs, newlines
and control characters. The substitutions are then applied to the
intermediate result.
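The intermediate stage can be approximated as follows. The sketch uses capture.output from the base utils package as a stand-in for the captureOutput function named above, so details of the captured text may differ from what latexify works with internally.

```r
# Illustration only: capture the printed form of a "bytes" string.
x <- "clich\xe9"
Encoding(x) <- "bytes"

# Printing a "bytes" string shows non-ASCII bytes as hexadecimal codes;
# capture.output() stores that printed text as an ordinary string.
shown <- utils::capture.output(print(x))

shown
# The captured text contains only ASCII characters
all(as.integer(charToRaw(shown)) < 128L)
```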
The quoting functions sQuote and dQuote may use non-ASCII quote
characters, depending on the locale. These quotes are also converted
to LaTeX commands, which means that the quoting functions are safe to
use with any LaTeX input encoding. Similarly, some other non-ASCII
characters, e.g. letters, currency symbols, punctuation marks and
diacritics, are converted to commands.
Adding "eurosym" to packages enables the use of the euro sign as
provided by the "eurosym" package (‘\euro’).
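A call that allows the "eurosym" package might look like the sketch below. The input string is made up for the illustration, and the call is skipped when dplR is not installed. No output is shown because it depends on the installed version of dplR.

```r
# Illustration only: add "eurosym" to the default set of allowed packages.
if (requireNamespace("dplR", quietly = TRUE)) {
  price <- "\u20ac 100"  # euro sign followed by an amount
  with_eurosym <- dplR::latexify(price, doublebackslash = FALSE,
                                 packages = c("fontenc", "textcomp",
                                              "eurosym"))
  print(with_eurosym)
}
```

The document preamble then also needs to load the "eurosym" package.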
The result is converted to UTF-8 encoding, Normalization Form C (NFC).
Note that this function will not add any non-ASCII characters that
were not already present in the input. On the contrary, some non-ASCII
characters, e.g. all characters in the "latin1" (ISO-8859-1) Encoding
(character set), are removed when converted to LaTeX commands. Any
remaining non-ASCII character has a good chance of working when the
document is processed with XeTeX or LuaTeX, but the Unicode support
available with pdfTeX is limited.
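To see whether a string still contains non-ASCII characters, which matters mostly when compiling with pdfTeX, a base R check such as the following can be used. The helper name has_non_ascii is hypothetical and is not part of dplR.

```r
# Hypothetical helper: TRUE for elements containing non-ASCII characters.
has_non_ascii <- function(x) {
  # iconv() returns NA for elements that cannot be represented in ASCII
  is.na(iconv(x, from = "UTF-8", to = "ASCII"))
}

flags <- has_non_ascii(c("plain text", "\u03b1 level"))
flags
```

The same check can be applied to the return value of latexify to find elements that may need XeTeX or LuaTeX.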
Assuming that ‘pdflatex’ is used for compilation, suggested package
loading commands in the document preamble are:
\usepackage[T1]{fontenc}    % no '"' in OT1 font encoding
\usepackage{textcomp}       % some symbols e.g. straight single quote
\usepackage[utf8]{inputenx} % UTF-8 input encoding
\input{ix-utf8enc.dfu}      % more supported characters
Value
A character vector
Author(s)
Mikko Korpela
References
INRIA. Tralics: a LaTeX to XML translator, HTML documentation of all TeX commands. https://www-sop.inria.fr/marelle/tralics/.
Levitt, N., Persch, C., and Unicode, Inc. (2013) GNOME Character Map, application version 3.10.1.
Mittelbach, F., Goossens, M., Braams, J., Carlisle, D., and Rowley, C. (2004) The LaTeX Companion. Addison-Wesley, second edition. ISBN-13: 978-0-201-36299-2.
Pakin, S. (2009) The Comprehensive LaTeX Symbol List. https://www.ctan.org/tex-archive/info/symbols/comprehensive.
The Unicode Consortium. The Unicode Standard. https://home.unicode.org/.
Examples
library(dplR)
## A latin1 string with accented letters and a newline
x1 <- "clich\xe9\nma\xf1ana"
Encoding(x1) <- "latin1"
x1
x2 <- x1
Encoding(x2) <- "bytes"
x2
x3 <- enc2utf8(x1)
testStrings <-
c("different kinds\nof\tspace",
"control\a characters \ftoo",
"{braces} and \\backslash",
'#various$ %other^ &characters_ ~escaped"/coded',
x1,
x2,
x3)
latexStrings <- latexify(testStrings, doublebackslash = FALSE)
## All should be "unknown"
Encoding(latexStrings)
cat(latexStrings, sep="\n")
## Input encoding does not matter
identical(latexStrings[5], latexStrings[7])