Hacker News
by hubraumhugo 488 days ago
As someone building in this space, we've found that raw OCR accuracy is just one piece (and it's becoming a commodity).

The real challenge is building reliable and accurate ETL pipelines (document ingestion from web, OCR, classification, validation, etc.) that work at scale in production.

The best products will be defined by everything "non-AI", like UX, performance, and a human-in-the-loop feedback loop for non-techies.

Avoiding over-reliance on specific models also helps. With good internal eval data and benchmarks, you can easily switch or fine-tune models.
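
To make the "switch models easily" point concrete, here is a rough sketch of what that could look like. It assumes every OCR backend can be wrapped as a plain "bytes in, text out" function and that you keep a small labeled eval set of (document, ground-truth text) pairs. The names `OcrModel`, `char_error_rate`, and `pick_best` are made up for illustration, and character error rate is just one reasonable metric, not necessarily what anyone runs in production.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: any OCR backend is wrapped as "bytes in, text out",
# so the rest of the pipeline never depends on a specific vendor or model.
OcrModel = Callable[[bytes], str]
EvalSet = List[Tuple[bytes, str]]  # (document bytes, ground-truth text)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def char_error_rate(model: OcrModel, eval_set: EvalSet) -> float:
    """Total edit distance divided by total reference length over the eval set."""
    errors = sum(edit_distance(model(doc), truth) for doc, truth in eval_set)
    chars = sum(len(truth) for _, truth in eval_set)
    return errors / chars if chars else 0.0


def pick_best(models: Dict[str, OcrModel], eval_set: EvalSet) -> str:
    """Return the name of the candidate with the lowest character error rate."""
    return min(models, key=lambda name: char_error_rate(models[name], eval_set))
```

With something like this in place, trying a new model is a matter of wrapping it in a function, adding it to the dict, and rerunning the benchmark. The eval set is the part that takes real effort to build.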

1 comment

That’s the point of using AI in the first place. If your product is just a polished interface on top of a prompt, then your moat isn’t that strong, and chances are your product will be commoditized soon.

By building a good UX and integrating it with other processes that require traditional collaboration, you increase the chances that replicating your secret sauce is either infeasible or too difficult for newcomers to bother with.