install_tika {rtika} | R Documentation |
Install or Update the Apache Tika jar
Description
This downloads and installs the Tika App jar
(~60 MB) into a user directory,
and verifies the integrity of the file using a checksum.
The default settings should work fine.
Usage
install_tika(
  version = "2.7.0",
  digest = paste0("7fefbe5570a95900d39193134e8277aec99e5450a8",
    "cecbb5787b3d6651ebf735e460ccccddb49bdc2990",
    "8a9058fc36e4689aed6da6d63a1cf70ca09ccf26bcca"),
  mirrors = c("https://ftp.wayne.edu/apache/tika/",
    "http://mirrors.ocf.berkeley.edu/apache/tika/",
    "http://apache.cs.utah.edu/tika/",
    "http://mirror.cc.columbia.edu/pub/software/apache/tika/"),
  retries = 2,
  url = character()
)
Arguments
version |
The declared Tika version |
digest |
The SHA-512 checksum. Set to an empty string ("") to skip the check. |
mirrors |
A vector of Apache mirror sites. One is picked randomly. |
retries |
The number of times to try the download. |
url |
Optional URL pointing to a specific location of the Tika App jar. Setting this to any character string overrides downloading from random mirrors. |
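As a rough illustration of the digest argument: a SHA-512 checksum written in hexadecimal is 128 characters long, which is why the default is assembled from three shorter strings with paste0(). The sketch below checks that a candidate digest has that shape before it is passed to install_tika(). The helper is_sha512_hex() is hypothetical and not part of rtika; it only checks the format of the string and does not verify any file.

```r
# Hypothetical helper (not part of rtika): TRUE if x is a single string
# made of exactly 128 hexadecimal characters, the shape of a SHA-512 digest.
is_sha512_hex <- function(x) {
  is.character(x) && length(x) == 1 && grepl("^[0-9a-fA-F]{128}$", x)
}

# The default digest from the Usage section, assembled the same way.
default_digest <- paste0(
  "7fefbe5570a95900d39193134e8277aec99e5450a8",
  "cecbb5787b3d6651ebf735e460ccccddb49bdc2990",
  "8a9058fc36e4689aed6da6d63a1cf70ca09ccf26bcca"
)

print(nchar(default_digest))
print(is_sha512_hex(default_digest))

# An empty string is not a valid digest; it is the documented way to skip the check.
print(is_sha512_hex(""))
```

A check like this can catch a digest that was truncated or mis-pasted when supplying a custom version.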
Value
A logical value indicating whether the installation was successful.
Details
The default settings of install_tika()
should typically be left as they are.
This function will download the version of the Tika jar
tested to work
with this package, and can verify file integrity using a checksum.
It will normally download from a random Apache mirror.
If the mirror fails, it tries the Apache archive (http://archive.apache.org/dist/tika/).
You can also enter a value for url directly to override this.
It will download into a directory determined by tools::R_user_dir("rtika", which = "data"), which is specific to the operating system.
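To see where the jar would be stored on a given machine, the directory can be inspected directly. This is a minimal sketch that assumes R 4.0.0 or later (where tools::R_user_dir() is available); the file listing is only illustrative, since the exact file names written by install_tika() are not described here.

```r
# Directory that rtika is documented to use for the jar (requires R >= 4.0.0).
jar_dir <- tools::R_user_dir("rtika", which = "data")
print(jar_dir)

if (dir.exists(jar_dir)) {
  # List whatever is stored there, with sizes in megabytes.
  files <- list.files(jar_dir, full.names = TRUE)
  print(data.frame(
    file = basename(files),
    size_mb = round(file.size(files) / 1024^2, 1)
  ))
} else {
  message("No rtika data directory yet; install_tika() has probably not been run.")
}
```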
If tika() stops with an error complaining about the jar, try running install_tika() again.
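The advice to run the installer again can be scripted using the logical return value. The sketch below is one possible approach and not part of rtika: ensure_tika() is a hypothetical wrapper that calls an installer function until it reports success, treating an error as a failed attempt. The installer is passed as an argument so the logic can be tried without downloading anything.

```r
# Hypothetical wrapper (not part of rtika): call `installer` up to `attempts`
# times and return TRUE as soon as one call reports success.
ensure_tika <- function(installer = rtika::install_tika, attempts = 2) {
  for (i in seq_len(attempts)) {
    # Treat both an error and a non-TRUE return value as a failed attempt.
    ok <- tryCatch(isTRUE(installer()), error = function(e) FALSE)
    if (ok) {
      return(TRUE)
    }
    message("Install attempt ", i, " of ", attempts, " did not succeed.")
  }
  FALSE
}

# Dry run with a stand-in installer that fails once and then succeeds.
calls <- 0
fake_installer <- function() {
  calls <<- calls + 1
  calls >= 2
}
print(ensure_tika(fake_installer, attempts = 3))
```

With rtika installed, calling ensure_tika() with no arguments would use install_tika() and its default settings. Note that install_tika() already retries the download itself via its retries argument, so a wrapper like this is only an outer safety net.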
Uninstalling
If you are uninstalling the entire rtika package and also want to remove the Tika App jar, run:
unlink(tools::R_user_dir("rtika", which = "data"), recursive = TRUE)
Alternatively, navigate to the install folder and delete it manually.
It is the file path returned by tools::R_user_dir("rtika", which = "data").
The path is OS-specific.
Distribution
Tika is distributed under the Apache License Version 2.0, which generally permits distribution of the code "Object" without the "Source". The master copy of the Apache Tika source code is held in Git. You can fetch (clone) the large source from GitHub (https://github.com/apache/tika).
Examples
install_tika()