| tika_html {rtika} | R Documentation |
Get Structured XHTML
Description
If output_dir is specified, files will have the .html file extension.
Usage
tika_html(input, ...)
Arguments
| input | Character vector describing the paths and/or URLs to the input documents. |
| ... | Other parameters to be sent to |
Value
A character vector of unparsed XHTML, in the same order and with the same length as input. Files that could not be processed are returned as as.character(NA).
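Because failures come back as NA in the same position as their input, the result can be filtered with base R alone. The following is a minimal sketch, not part of the package: `result` is a hypothetical stand-in for what tika_html() might return (one parsed document and one failure), and the XHTML string in it is invented for illustration.

```r
# Hypothetical stand-in for a tika_html() result:
# one parsed document and one file that could not be processed.
result <- c(
  "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>Hello</p></body></html>",
  NA_character_
)

# TRUE where XHTML came back, FALSE where the file was not processed.
processed <- !is.na(result)

# Keep only the XHTML strings that were returned.
xhtml <- result[processed]

# Positions of the failures; these match positions in the input vector.
failed_index <- which(!processed)
```

Since the output preserves the order and length of input, `failed_index` can be used to index the original input vector and list the files that need attention.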
Examples
# Sample documents of several formats that ship with the package
batch <- c(
  system.file("extdata", "jsonlite.pdf", package = "rtika"),
  system.file("extdata", "curl.pdf", package = "rtika"),
  system.file("extdata", "table.docx", package = "rtika"),
  system.file("extdata", "xml2.pdf", package = "rtika"),
  system.file("extdata", "R-FAQ.html", package = "rtika"),
  system.file("extdata", "calculator.jpg", package = "rtika"),
  system.file("extdata", "tika.apache.org.zip", package = "rtika")
)
# One XHTML string per input file, in the same order as batch
html <- tika_html(batch)
[Package rtika version 2.7.0 Index]