by refulgentis
641 days ago
GPT-4o doesn't do actual OCR, and there are much smaller and more effective models for this specific problem. I appreciate your work, intent, and sharing it. It's very important to appreciate what you're doing and its context when sharing it. At that point, you are responsible for it, and the choices you make when communicating about it reflect on you.
|
I've been testing it on pitch decks made in Figma and saved as JPGs. Surprisingly, the LLM OCR outperformed top dogs like SolidDocuments and PDFtron. Since I'm mainly after good context for the LLM from PDFs, I've been using a hybrid setup, bringing in the LLM OCR only for pages that need it. In my book, this API is perfect for these kinds of situations.
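
For anyone curious what a hybrid setup like this could look like, here's a rough sketch of the per-page routing idea. Everything in it is an assumption for illustration: `extract_text` and `llm_ocr` are hypothetical callables you'd supply yourself (they are not part of any real library), and the "too little text" threshold is just one possible way to decide that a page needs the LLM.

```python
from typing import Callable, List


def needs_llm_ocr(extracted_text: str, min_chars: int = 40) -> bool:
    """Hypothetical heuristic: a page that yields almost no text from the
    regular extractor is probably an image (e.g. a slide saved as a JPG)."""
    return len(extracted_text.strip()) < min_chars


def extract_document(
    pages: List[bytes],
    extract_text: Callable[[bytes], str],  # assumed: your regular PDF text extractor
    llm_ocr: Callable[[bytes], str],       # assumed: your wrapper around the LLM OCR API
    min_chars: int = 40,
) -> List[str]:
    """Return one text string per page, calling the LLM only where needed."""
    results = []
    for page in pages:
        text = extract_text(page)
        if needs_llm_ocr(text, min_chars):
            # Fall back to the slower, costlier LLM OCR for this page only.
            text = llm_ocr(page)
        results.append(text)
    return results
```

Passing the two extractors in as plain callables keeps the routing logic independent of whichever PDF library or OCR API you actually use, and the threshold is something you'd want to tune on your own documents.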