Hacker News
by MajidAliSyncOps 165 days ago
Interesting direction. Using evolutionary pressure to improve agent reasoning feels promising, especially beyond static benchmarks. One trade-off I’m curious about is evaluation drift—when tasks co-evolve, how do you ensure you’re not just optimizing for the framework itself rather than real-world generalization?
1 comment

Each task is unique unless we give the tasks shared memory. This means that when you start a task, it runs the full evolutionary process from the start.