by leechii1337 74 days ago
how does this compare to e.g. docling, mineru. hard to keep track of all the ocr libs that are being posted.

Docling and MinerU are great for structured output like markdown and table extraction, but they run at 1-5 pages/s because of the VLMs under the hood.

Turbo-OCR gives you bounding boxes, text, and layout regions at several hundred img/s, depending on text density. When you have many PDFs to process, that makes a huge difference. You can always pipe the output into a VLM for the pages that need deeper extraction. Structured extraction and markdown output are on the roadmap (without sacrificing too much speed).
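
To make the "fast pass first, VLM only where needed" idea concrete, here is a rough sketch of the routing step. Everything in it is an assumption for illustration: `Region`, `PageResult`, `needs_vlm`, and the `"table"`/`"figure"` labels are hypothetical names, not Turbo-OCR's actual output format or API. It only shows how per-page layout regions could decide which pages go on to a slower model.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Region:
    kind: str  # hypothetical layout label, e.g. "text", "table", "figure"
    text: str


@dataclass
class PageResult:
    page: int
    regions: list[Region]


# Assumption: pages containing these region kinds need deeper extraction.
HARD_KINDS = frozenset({"table", "figure"})


def needs_vlm(result: PageResult) -> bool:
    """Return True if any region on the page has a 'hard' layout kind."""
    return any(r.kind in HARD_KINDS for r in result.regions)


def route(results: Iterable[PageResult]) -> tuple[list[PageResult], list[PageResult]]:
    """Split fast-OCR results into (done, send_to_vlm), preserving order."""
    done, send_to_vlm = [], []
    for result in results:
        (send_to_vlm if needs_vlm(result) else done).append(result)
    return done, send_to_vlm
```

With a split like this, the slow model only sees the pages flagged by the fast pass, so overall throughput depends on what fraction of pages are flagged. The criterion in `needs_vlm` is just a placeholder; a confidence threshold or any other rule would slot in the same way.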