LLM Evals Are Just Tests. Why Are We Making This So Complicated?
(cameronwestland.com)
3 points by camwest 307 days ago | 1 comment
8organicbits 307 days ago
So, did the tests allow you to build a system that never confused existing features with new features? That seems like the problem statement, but I think I'm only seeing probabilistic testing.
camwest 307 days ago
Never? No. Way less likely? Yes!
In dev we do 100 consistency checks and get green. In CI we do 10.
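For anyone wondering what a consistency check like this could look like, here is a rough sketch, not the actual setup from the post. It repeats the same prompt N times and reports the pass rate, with 100 runs locally and 10 in CI as described above. The `CI` environment variable, the function names, and the flaky stand-in classifier are all assumptions made for illustration. The last helper puts a number on "way less likely": assuming runs fail independently, it gives the largest per-run failure rate still consistent with N straight passes.

```python
import os
import random


def pass_rate(classify, prompt, expected, runs):
    """Call a nondeterministic classifier `runs` times; return the fraction of matches."""
    hits = sum(1 for _ in range(runs) if classify(prompt) == expected)
    return hits / runs


def runs_for_environment():
    """100 runs locally, 10 in CI (the counts mentioned in the comment).

    Detecting CI through a `CI` environment variable is an assumption.
    """
    return 10 if os.environ.get("CI") else 100


def failure_rate_upper_bound(clean_runs, confidence=0.95):
    """Largest per-run failure rate still consistent with `clean_runs` straight passes.

    If each run fails independently with probability p, all n runs pass with
    probability (1 - p) ** n. Solving (1 - p) ** n = 1 - confidence for p
    gives the bound returned here.
    """
    return 1 - (1 - confidence) ** (1 / clean_runs)


def make_flaky_classifier(error_rate, seed=0):
    """Hypothetical stand-in for an LLM call: wrong with probability `error_rate`."""
    rng = random.Random(seed)

    def classify(prompt):
        return "new" if rng.random() < error_rate else "existing"

    return classify


if __name__ == "__main__":
    classify = make_flaky_classifier(error_rate=0.02)
    runs = runs_for_environment()
    rate = pass_rate(classify, "Does dark mode already exist?", "existing", runs)
    print(f"{runs} runs, pass rate {rate:.2f}")
    print(f"bound after 100 clean runs: {failure_rate_upper_bound(100):.3f}")
    print(f"bound after 10 clean runs:  {failure_rate_upper_bound(10):.3f}")
```

Under the independence assumption, 100 clean runs only bound the per-run failure rate at about 3% with 95% confidence, and 10 clean runs at about 26%. So a green run shows the failure is rarer, not that it is gone.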