Hacker News
by tlofreso 589 days ago
Having the text is (for now) still pretty important for quality output. The vision models are quite good, but they're not a replacement for a quality OCR step. Combining text and vision is compelling too.
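
To make the text + vision idea concrete, here is a rough sketch of bundling the OCR output and the page image into a single request. The function name `build_page_request` and the payload layout are made up for illustration and don't match any particular vendor's API; you'd adapt the shape to whatever model you're calling.

```python
import base64


def build_page_request(ocr_text: str, image_bytes: bytes, question: str) -> dict:
    """Bundle OCR text and the page image into one request payload.

    The dict layout is hypothetical, not a real vendor schema.
    """
    # Encode the raw image bytes so they can travel inside a JSON body.
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "parts": [
            # The OCR text gives the model exact characters to quote from.
            {"type": "text", "text": f"OCR text of the page:\n{ocr_text}"},
            # The image carries layout, tables, and anything OCR missed.
            {"type": "image", "encoding": "base64", "data": image_b64},
            {"type": "text", "text": question},
        ]
    }
```

The point is that the model gets both views of the same page: the OCR text for exact characters, and the image for layout and anything the OCR step dropped.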