by
tensor
480 days ago
100% this. Combining traditional OCR with VLMs that can work with bounding boxes, so that you can correlate the two outputs, is the way to go.
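To make "correlate the two" concrete, here is a rough sketch of one way it could be done: match each VLM-reported region to the best-overlapping OCR box by intersection-over-union (IoU), then check whether the texts agree. The `Box` type, the `correlate` helper, and the 0.5 threshold are all hypothetical names and values picked for illustration, not any particular OCR or VLM library's API, and it assumes both systems report boxes in the same pixel coordinate space.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical text region: corner coordinates plus recognized text."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x1, b.x1) - max(a.x0, b.x0))
    inter_h = max(0.0, min(a.y1, b.y1) - max(a.y0, b.y0))
    inter = inter_w * inter_h
    area_a = (a.x1 - a.x0) * (a.y1 - a.y0)
    area_b = (b.x1 - b.x0) * (b.y1 - b.y0)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def correlate(
    ocr_boxes: List[Box],
    vlm_boxes: List[Box],
    min_iou: float = 0.5,  # assumed threshold; tune for your documents
) -> List[Tuple[Box, Optional[Box], bool]]:
    """For each VLM box, return (vlm_box, matched_ocr_box_or_None, texts_agree)."""
    results = []
    for v in vlm_boxes:
        # Pick the OCR box that overlaps this VLM box the most.
        best = max(ocr_boxes, key=lambda o: iou(o, v), default=None)
        if best is not None and iou(best, v) >= min_iou:
            results.append((v, best, best.text == v.text))
        else:
            # No OCR support for this region: possibly a hallucinated field.
            results.append((v, None, False))
    return results
```

Regions where the VLM has no overlapping OCR box, or where the two texts disagree, are the ones worth flagging for review. Real pipelines would likely need fuzzier text comparison and one-to-many matching (a VLM field spanning several OCR words), which this sketch leaves out.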