by milesokeefe
2811 days ago
Tika doesn’t do OCR; it only extracts text content from binary files. For an image, it’ll only give you metadata and such. A better comparison would be against Tesseract or ABBYY FineReader.

EDIT: I wasn't aware that Tika now embeds Tesseract. [1] Still, it's a simple wrapper, so the real comparison is against Tesseract.

[1] https://wiki.apache.org/tika/TikaOCR