Hacker News
by llm_nerd 335 days ago
Modern OCR uses machine learning, including ViT and precisely the same models and techniques used in the linked solution. If their comparison were with OCR from 2002, sure, but they're comparing against modern OCR solutions that generate text representations of documents using the very latest machine learning innovations and massive models (along with transformer-based contextual inference over the text), and their own solution uses precisely the same stack. It's a weird thing for them to continually harp on.

Their solution is exactly as subject to textual ambiguities as the OCR solutions they compare it against.