by serjester 596 days ago
Honest question, but how do you see your business being affected as foundational models improve? While I have massive complaints about them, Gemini + structured outputs is working remarkably well for this internally, and it's only getting better. It's also an order of magnitude cheaper than anything I've seen commercially.

We're excited for foundational models to improve because we hope it will unlock a lot more use cases: things like analysis after extraction, accurate extraction from extremely complex documents, etc.!
Curious - have you compared Gemini against Anthropic and OpenAI’s offerings here? I need to do something similar for a one-off task and simply need to choose a model to use.
Gemini is an awful developer experience, but its accuracy for OCR tasks is close to perfect. The pricing is also basically unbeatable - it works out to 1k to 10k pages per dollar depending on the model. OpenAI has subtle hallucinations, and I haven’t profiled Anthropic.
If I may ask, which model are you using? I have tried OCR'ing my bank statements in AI Studio and the results have been less than optimal. Specifically, it tends to ignore certain instructions and also gets the order of the entries wrong.

Some pointers on what worked for you would be greatly appreciated.

Thanks!