make_hocr {daiR} | R Documentation |
Make hOCR file
Description
Creates an hOCR file from Document AI output.
Usage
make_hocr(type, output, outfile_name = "out.hocr", dir = getwd())
Arguments
type | one of "sync" or "async", depending on the function used to process the original document. |
output | either an HTTP response object (from |
outfile_name | a string with the desired filename. Must end with either |
dir | a string with the path to the desired output directory. |
Details
hOCR is an open standard for representing formatted text obtained from optical character recognition (OCR). Among other uses, hOCR files can be converted into searchable PDFs. This function generates a file that complies with the official hOCR specification (https://github.com/kba/hocr-spec), including token-level confidence scores. It also works with non-Latin scripts and right-to-left languages.
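To illustrate what token-level confidence scores look like in practice, here is a minimal sketch that reads them back out of an hOCR document with the xml2 package. It assumes that words are marked up as `ocrx_word` spans carrying an `x_wconf` property in their `title` attribute, as the hOCR specification describes; the embedded fragment is hand-written for illustration and is not actual make_hocr() output, whose exact markup may differ.

```r
library(xml2)

# Hand-written hOCR fragment for illustration only (an assumption, not
# output produced by make_hocr()).
hocr <- "<html><body><div class='ocr_page' title='bbox 0 0 600 800'>
  <span class='ocr_line' title='bbox 10 10 300 40'>
    <span class='ocrx_word' title='bbox 10 10 120 40; x_wconf 97'>Hello</span>
    <span class='ocrx_word' title='bbox 130 10 300 40; x_wconf 88'>world</span>
  </span></div></body></html>"

# Parse the markup and select every word-level element.
doc <- read_html(hocr)
words <- xml_find_all(doc, "//span[@class='ocrx_word']")

# The confidence is stored in the title attribute after "x_wconf".
titles <- xml_attr(words, "title")
conf <- as.numeric(sub(".*x_wconf ([0-9.]+).*", "\\1", titles))

# One row per token with its confidence score.
tokens <- data.frame(word = xml_text(words), confidence = conf)
print(tokens)
```

To try this on a file written by make_hocr(), replace the literal string with the path to that file; whether the same XPath expression matches depends on the markup the function actually emits.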
Value
no return value, called for side effects.
Examples
## Not run:
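# Output of asynchronous processing: pass the JSON file
make_hocr(type = "async", output = "output.json")
# Output of synchronous processing: pass the response object from dai_sync()
resp <- dai_sync("file.pdf")
make_hocr(type = "sync", output = resp)
# Same as above, with a custom output filename
make_hocr(type = "sync", output = resp, outfile_name = "myfile.xml")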
## End(Not run)