Hacker News
by
mediaman
245 days ago
I've had a lot of success asking models such as Gemini to OCR the text and then describe any images in the document, including charts. I have it format the sections with XML-ish tags. This also works for tables.
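A rough sketch of what this could look like, in Python. The prompt wording and the tag names (`text`, `table`, `image`) are my own assumptions, since the comment only says "XML-ish tags", and the sample response is made up for illustration, not real model output. The call to the model itself is left out; this only shows the prompt and how the tagged reply might be split back into sections.

```python
import re

# Hypothetical prompt and tag names; adjust to whatever the model follows best.
PROMPT = (
    "Transcribe all text in this document page verbatim.\n"
    "Wrap each block of body text in <text>...</text>.\n"
    "Wrap each table in <table>...</table>, one row per line, "
    "cells separated by ' | '.\n"
    "For each image or chart, write a description in <image>...</image>.\n"
    "Output only the tagged sections, in reading order."
)

# Matches one tagged section; the backreference \1 requires the closing tag
# to match the opening tag. DOTALL lets content span multiple lines.
SECTION_RE = re.compile(r"<(text|table|image)>(.*?)</\1>", re.DOTALL)


def parse_sections(response: str) -> list[tuple[str, str]]:
    """Return (tag, content) pairs in document order, skipping untagged text."""
    return [(m.group(1), m.group(2).strip()) for m in SECTION_RE.finditer(response)]


# Made-up response, for illustration only.
sample = """<text>Quarterly report</text>
<image>Bar chart of revenue by quarter.</image>
<table>Quarter | Revenue
Q1 | 10</table>"""

sections = parse_sections(sample)
for tag, content in sections:
    print(f"[{tag}] {content}")
```

A regex is used instead of a real XML parser because model output is only XML-ish: it may contain stray `&` or `<` characters that a strict parser would reject.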