by raxxorraxor 484 days ago
This has always been part of the complete OCR package as far as I know. Raw OCR output constantly fails to differentiate 1, l, I, i, | and other similar symbols or letters.
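To make that concrete, here is a rough sketch of the kind of post-correction pass I mean. It is not taken from any particular OCR package; the function names and the majority-vote heuristic are my own assumptions, and real pipelines typically use dictionaries or language models instead.

```python
# Hypothetical heuristic: resolve confusable glyphs from the rest of the token.
CONFUSABLE = set("1lIi|")


def fix_token(token: str) -> str:
    """Guess the intended glyphs in one token from its unambiguous characters."""
    others = [c for c in token if c not in CONFUSABLE]
    digits = sum(c.isdigit() for c in others)
    letters = sum(c.isalpha() for c in others)
    if digits > letters:
        # Mostly numeric token: read every confusable glyph as the digit 1.
        return "".join("1" if c in CONFUSABLE else c for c in token)
    if letters > digits:
        # Mostly alphabetic token: a digit 1 or a pipe is assumed to be a lowercase l.
        return "".join("l" if c in "1|" else c for c in token)
    # No evidence either way, so leave the token unchanged.
    return token


def fix_text(text: str) -> str:
    """Apply fix_token to each space-separated token."""
    return " ".join(fix_token(t) for t in text.split(" "))


print(fix_text("he1lo 20l9"))
```

A token made up only of confusable glyphs is left alone, which is exactly the case where more context (or maybe a VLM) would have to decide.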

Maybe this necessary step can be improved or changed with a VLM. There is also the preprocessing step where the image gets its perspective corrected. I'm not sure how well a VLM performs there.

As you said, I think combining these techniques will be the most efficient way forward.