change.encoding {stylo} | R Documentation |
Change character encoding
Description
This function is a wrapper around iconv() that converts the character
encoding of multiple text files in a corpus folder, preferably into UTF-8.
Usage
change.encoding(corpus.dir = "corpus/", from, to = "utf-8",
keep.original = TRUE, output.dir = NULL)
Arguments
corpus.dir |
path to the folder containing the corpus. |
from |
original character encoding. See the Details section (below) for some hints on how to get the original encoding. |
to |
character encoding to convert into. |
keep.original |
should the original files be kept? |
output.dir |
folder for the reencoded files. |
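Because the function wraps iconv(), the names passed as from and to are presumably the ones iconv() itself understands. As a rough sketch using base R only, the following lists the encoding names available on the current platform; the helper is_known_encoding() is hypothetical and not part of stylo, and the list returned by iconvlist() differs between systems.

```r
# Encoding names known to the platform's iconv implementation (base R).
available <- iconvlist()

# Hypothetical helper (not part of stylo): case-insensitive lookup of a name.
is_known_encoding <- function(name) {
  toupper(name) %in% toupper(available)
}

# Inspect a few names and check candidates before using them as 'from' or 'to'.
head(available)
is_known_encoding("UTF-8")
is_known_encoding("latin1")
```

A name missing from this list may still be accepted as an alias on some systems, so treat the lookup as a hint rather than a guarantee.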
Details
Stylo works on UTF-8-encoded texts by default. This function allows you to convert your corpus if it is not yet encoded in UTF-8. To check the current encoding of the text files in your corpus folder, you can use the function check.encoding().
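To illustrate the kind of conversion involved, here is a minimal sketch of re-encoding a single file with base R's iconv(). It is not the package's actual implementation; the function name reencode_file() and the temporary demo file are assumptions made for this example only.

```r
# Hypothetical helper (not part of stylo): re-encode one text file.
reencode_file <- function(infile, outfile, from, to = "UTF-8") {
  # Read the lines as stored; no conversion happens at this point.
  lines <- readLines(infile, warn = FALSE)
  # Convert from the declared source encoding to the target encoding.
  converted <- iconv(lines, from = from, to = to)
  # iconv() returns NA for any string it cannot convert.
  if (anyNA(converted)) {
    stop("some lines could not be converted from ", from, " to ", to)
  }
  # Write the converted bytes unchanged, without re-encoding on output.
  writeLines(converted, outfile, useBytes = TRUE)
  invisible(outfile)
}

# Demo: write the word "café" as Latin-1 bytes, then convert it to UTF-8.
tmp_in  <- tempfile(fileext = ".txt")
tmp_out <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a)), tmp_in)
reencode_file(tmp_in, tmp_out, from = "latin1")
out_bytes <- readBin(tmp_out, what = "raw", n = 16)
```

The sketch stops with an error when a line cannot be converted, rather than silently writing NA; whether change.encoding() behaves the same way is not stated here.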
Value
The function saves reencoded text files.
Author(s)
Steffen Pielström
See Also
Examples
## Not run:
# To replace the old files with the re-encoded versions, but retain the
# originals in another folder:
change.encoding(corpus.dir = "~/corpora/example/",
                from = "ASCII", to = "utf-8")
# To place the new versions in another folder called "utf8/":
change.encoding(corpus.dir = "~/corpora/example/",
                from = "ASCII",
                to = "utf-8",
                output.dir = "utf8/")
# To simply replace the old versions:
change.encoding(corpus.dir = "~/corpora/example/",
                from = "ASCII",
                to = "utf-8",
                keep.original = FALSE)
## End(Not run)
[Package stylo version 0.7.5 Index]