by PashaGo 2 hours ago
It would be nice to see some metrics. I think the missing layer here is evaluation. If agents are going to produce applications, the platform needs not only guardrails but also public-ish evidence that those guardrails actually catch failures.
1 comment

I fully agree