Hacker News
by
abstract257
8 days ago
Curious how it does on multi-page scanned PDFs vs. single screenshots? The ORT vision/decoder split is the part that usually makes or breaks CPU VLM OCR...
1 comment
krunck
8 days ago
I had to extract the images from the PDF for it to work, then run it on each extracted page image.
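A minimal sketch of that per-page workflow, assuming PyMuPDF (`fitz`) is installed for the rendering step. `ocr_image` is a hypothetical stand-in for whatever call runs the OCR model on one image, since the thread does not name it. Note that this rasterizes each page rather than pulling out the embedded scan image, which is a substitute approach that should give similar input for scanned PDFs.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def render_pages(pdf_path: str, out_dir: str, dpi: int = 200) -> List[Path]:
    """Render every page of a PDF to a PNG file and return the paths in page order."""
    import fitz  # PyMuPDF (assumed installed); imported lazily so ocr_pages works without it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    with fitz.open(pdf_path) as doc:
        for number, page in enumerate(doc, start=1):
            pix = page.get_pixmap(dpi=dpi)  # rasterize the page at the requested DPI
            path = out / f"page-{number:04d}.png"
            pix.save(str(path))
            paths.append(path)
    return paths


def ocr_pages(image_paths: Iterable[Path], ocr_image: Callable[[Path], str]) -> str:
    """Run the OCR callable on each page image in order and join the page texts."""
    # ocr_image is hypothetical: pass in whatever runs the model on a single image.
    return "\n\n".join(ocr_image(path) for path in image_paths)
```

Usage would look like `ocr_pages(render_pages("scan.pdf", "pages"), ocr_image=my_ocr)`, where `my_ocr` is your own wrapper around the OCR tool. The 200 DPI default is a guess; small text may need more.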
abstract257
8 days ago
Thanks