by rafram 480 days ago
Then use an LLM to extract layout information. Don’t trust it to read the text.

> If the OCR model gives you back 500 words all ranging from 0.70 to 0.95 confidence, what do you do? Reject the entire document if there's a single value below 0.90?

No, of course not. You have a human review the words/segments with low confidence.
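
A minimal sketch of that routing, assuming the OCR engine returns a confidence per word in the range 0 to 1. The `OcrWord` type, the `flag_for_review` helper, and the 0.90 cutoff (taken from the quoted question) are illustrative, not any particular library's API:

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    text: str
    confidence: float  # assumed to be in [0.0, 1.0]


# Cutoff borrowed from the quoted question; an assumption to tune per use case.
REVIEW_THRESHOLD = 0.90


def flag_for_review(words, threshold=REVIEW_THRESHOLD):
    """Return (index, word) pairs whose confidence is below the threshold.

    The document itself is never rejected; only the flagged words
    are sent to a human reviewer.
    """
    return [(i, w) for i, w in enumerate(words) if w.confidence < threshold]


if __name__ == "__main__":
    words = [OcrWord("Invoice", 0.95), OcrWord("t0tal", 0.70), OcrWord("due", 0.91)]
    for index, word in flag_for_review(words):
        print(f"review word {index}: {word.text!r} ({word.confidence:.2f})")
```

Keeping the index alongside each flagged word lets the reviewer's correction be written back into the right position in the document.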