by brianjking 746 days ago
LLMWhisperer from Zipstack (https://llmwhisperer.unstract.com/) or Surya (https://github.com/VikParuchuri/surya) will do a good job for you.

LLMWhisperer has some nice tooling: it can fall back to OCR, and it can also force text extraction, so it works on scanned documents as well as documents that have the text preserved as text.
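
For anyone wondering what that kind of fallback looks like in general, here's a rough Python sketch. To be clear, this is not LLMWhisperer's API. The function name, the `native_extract` / `ocr_extract` callables, the `min_chars` threshold and the `force_ocr` flag are all made up for illustration; you'd plug in whatever PDF text extractor and OCR engine you actually use.

```python
from typing import Callable, Tuple


def extract_with_ocr_fallback(
    path: str,
    native_extract: Callable[[str], str],  # hypothetical: reads the embedded text layer
    ocr_extract: Callable[[str], str],     # hypothetical: runs OCR on the rendered pages
    min_chars: int = 20,                   # assumed threshold for "the text layer is usable"
    force_ocr: bool = False,               # assumed switch to skip the text layer entirely
) -> Tuple[str, str]:
    """Return (text, method), where method is "native" or "ocr"."""
    if not force_ocr:
        text = native_extract(path)
        # A scanned document usually has an empty or near-empty text layer.
        if len(text.strip()) >= min_chars:
            return text, "native"
    # Either OCR was forced or the text layer was too thin to trust.
    return ocr_extract(path), "ocr"
```

The threshold is a crude heuristic. A real tool would probably check per page rather than per document, since mixed PDFs with some scanned pages are common.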