Hacker News
by mediaman 245 days ago
I've had a lot of success asking models such as Gemini to OCR the text and then describe any images in the document, including charts. I have the model wrap each section in XML-ish tags. This also works for tables.
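For anyone who wants to try this, here is a rough sketch of the prompt and of pulling the tagged sections back out of the reply. The tag names (`text`, `image`, `table`), the prompt wording, and the helpers `build_prompt` and `parse_sections` are all made up for illustration; the actual model call is left out, and `sample_reply` is a hand-written stand-in for a model response, not real output.

```python
import re

# Hypothetical tag names: the approach only calls for "XML-ish" tags.
TAGS = ("text", "image", "table")


def build_prompt(tags=TAGS):
    """Build an instruction asking for OCR plus image descriptions in tagged sections."""
    tag_list = ", ".join(f"<{t}>" for t in tags)
    return (
        "OCR all text in the attached document. "
        "Describe every image, including charts. "
        f"Wrap each section in one of these tags: {tag_list}. "
        "Close every tag you open."
    )


def parse_sections(reply, tags=TAGS):
    """Return (tag, content) pairs in document order.

    A regex is used instead of an XML parser because the reply is only
    XML-ish and may contain stray text or unescaped characters.
    """
    pattern = re.compile(
        r"<(%s)>(.*?)</\1>" % "|".join(map(re.escape, tags)),
        re.DOTALL,
    )
    return [(m.group(1), m.group(2).strip()) for m in pattern.finditer(reply)]


# Hand-written stand-in for a model reply (not real model output).
sample_reply = """<text>Quarterly report</text>
<image>Bar chart of revenue by region.</image>
<table>Region | Revenue
North | 10</table>"""

sections = parse_sections(sample_reply)
for tag, content in sections:
    print(tag, "->", content.splitlines()[0])
```

Send the string from `build_prompt()` along with the document using whatever client you already have, then pass the reply text to `parse_sections`. Nested tags are not handled; this assumes flat sections.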