Hacker News
by lucasay 83 days ago
This feels less like automated research and more like structured trial and error with a decent feedback loop. Still useful, but I think the real bottleneck is how good your eval metric is. If that’s weak, the whole loop just optimizes for the wrong thing faster.

Designing a good fitness function, a tale as old as time...
I mean, isn’t that “the scientific method”?
Partially, but science also questions the hypothesis and the metric. This mostly assumes both are correct and just optimizes within that box.
Only if the model is actually a human or equivalent; otherwise we don’t know what it is.