by vicchenai 74 days ago
The monitoring and evaluation piece is underrated. In my experience the hardest part isn't building the initial LLM pipeline, it's knowing when the thing quietly broke. Domain expertise matters a lot there because you need to design evals that actually catch the failure modes that matter for your specific data distribution.
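To make that concrete, here is a rough sketch of what I mean by an eval that catches a quiet break. Everything in it is a made-up assumption for illustration: the invoice-extraction setting, the `has_valid_total` check, and the 5-point `max_drop` threshold. The idea is just to encode one domain-specific expectation as a check, track its pass rate on production samples, and flag when it falls well below a known-good baseline.

```python
from typing import Callable, Sequence


def has_valid_total(output: dict) -> bool:
    """Hypothetical domain check: an extracted invoice needs a non-negative numeric total."""
    total = output.get("total")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)):
        return False
    return total >= 0


def pass_rate(outputs: Sequence[dict], check: Callable[[dict], bool]) -> float:
    """Fraction of outputs that satisfy the check."""
    if not outputs:
        raise ValueError("no outputs to evaluate")
    return sum(1 for o in outputs if check(o)) / len(outputs)


def quietly_broke(baseline_rate: float, current_rate: float, max_drop: float = 0.05) -> bool:
    """True when the pass rate fell more than max_drop below the baseline.

    max_drop is an assumed threshold; tune it to your own noise level.
    """
    return (baseline_rate - current_rate) > max_drop


# Known-good sample vs. a recent sample where the pipeline still returns
# well-formed dicts but the totals are missing or nonsensical.
baseline_outputs = [{"total": 10.0}, {"total": 3}, {"total": 0}, {"total": 7.5}]
recent_outputs = [{"total": 10.0}, {"total": None}, {}, {"total": -1}]

baseline = pass_rate(baseline_outputs, has_valid_total)
recent = pass_rate(recent_outputs, has_valid_total)
alert = quietly_broke(baseline, recent)
```

Nothing here throws an exception in production, which is the point: the pipeline keeps returning output, and only a check written by someone who knows what a valid total looks like notices the drop.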