Hacker News
by authorfly 640 days ago
What about combining old school OCR with GPT visual OCR?

If the old-school OCR produces text that is missing from the visual output but is coherent (e.g. English sentences), you could recover it and slot it into the corresponding gap in the visual output.
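To make that concrete, here is a rough sketch of one way the merge could work. Everything in it is my own assumption rather than an existing tool: the function names (`is_coherent`, `merge_ocr`), the 0.8 similarity threshold, and the crude "mostly letters, at least four words" coherence test are all hypothetical. It only uses Python's standard `difflib`, and it treats both OCR outputs as lists of lines.

```python
import difflib


def is_coherent(line, min_words=4, min_alpha_ratio=0.8):
    """Crude stand-in for 'looks like an English sentence' (assumption)."""
    words = line.split()
    if len(words) < min_words:
        return False
    # Share of characters that are letters or spaces.
    alpha = sum(ch.isalpha() or ch.isspace() for ch in line)
    return alpha / len(line) >= min_alpha_ratio


def similar(a, b, threshold=0.8):
    """Fuzzy match, so small OCR differences still count as the same line."""
    ratio = difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold


def merge_ocr(classic_lines, visual_lines):
    """Walk the classic output in order and splice coherent lines that the
    visual output lacks into roughly the same position."""
    merged = []
    next_visual = 0  # index of the first visual line not yet emitted
    for line in classic_lines:
        match = None
        for j in range(next_visual, len(visual_lines)):
            if similar(line, visual_lines[j]):
                match = j
                break
        if match is not None:
            # Emit visual lines up to and including the match; the visual
            # version wins when both engines saw the line.
            merged.extend(visual_lines[next_visual:match + 1])
            next_visual = match + 1
        elif is_coherent(line):
            # Only the classic engine saw this, and it reads like text.
            merged.append(line)
        # Otherwise: classic-only noise, dropped.
    merged.extend(visual_lines[next_visual:])
    return merged
```

Calling `merge_ocr(classic_lines, visual_lines)` returns the visual lines with the recovered classic-only sentences slotted in. The placement is only approximate when the visual output has lines of its own near the gap, and a real version would probably want a better coherence check (a dictionary or language model) than the letter ratio used here.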

1 comment

You're absolutely right. I use PDFTron (through CloudConvert) for full-document OCR, but for pages with fewer than 100 characters, I switch to this API. It's a great combo: I get the solid OCR performance of SolidDocument for most content, but I can also handle tricky stuff like stats, old-fashioned text, or handwriting that regular OCR struggles with. That's why I added page numbers upfront.
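For anyone who wants to try the same routing, a minimal sketch of the idea follows. It is not my actual pipeline: `classic_ocr` and `vision_ocr` are hypothetical callables standing in for whatever engines you use, and keeping the page number with each result is just one reading of why page numbers help (so the pieces can be put back in order). Only the 100-character threshold comes from the description above.

```python
SPARSE_PAGE_CHARS = 100  # threshold mentioned above


def ocr_document(pages, classic_ocr, vision_ocr, threshold=SPARSE_PAGE_CHARS):
    """Run classic OCR on every page and fall back to the vision model
    when a page yields fewer than `threshold` characters.

    `classic_ocr` and `vision_ocr` are placeholder callables (assumption):
    each takes one page and returns its text.
    """
    results = []
    for number, page in enumerate(pages, start=1):
        text = classic_ocr(page)
        engine = "classic"
        if len(text.strip()) < threshold:
            # Sparse result: likely handwriting, a chart, or odd typography.
            text = vision_ocr(page)
            engine = "vision"
        # Keep the page number so results can be reassembled in order.
        results.append((number, engine, text))
    return results
```

Each entry in the returned list is a `(page_number, engine, text)` tuple, so you can see afterwards which pages needed the fallback.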