Hacker News
by benchwright 42 days ago
Tends to be a problem. I've tried to mitigate it with either external harnesses (i.e. GitHub Actions that are "fixed" against a known-good state) or N witness agents cross-checking each other (e.g. Kimi/Qwen/whatever <=> Claude/OpenAI/Google). Either way it generally eats more time and energy (and now tokens/$).

That being said, I still have a "fix the code, not the test" line somewhere in here...
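
For anyone wondering what the "fixed against known-good" harness could look like, here is a rough sketch, not my actual setup. It assumes the tests live in one directory and that you record a digest of every test file before the agent runs; `snapshot` and `tampered` are made-up names for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(test_dir: Path) -> dict[str, str]:
    """Record a digest for every file under test_dir (the known-good state)."""
    return {
        p.relative_to(test_dir).as_posix(): sha256_of(p)
        for p in sorted(test_dir.rglob("*"))
        if p.is_file()
    }


def tampered(test_dir: Path, known_good: dict[str, str]) -> list[str]:
    """List test files changed, added, or deleted since the snapshot."""
    current = snapshot(test_dir)
    # Compare over the union of names so additions and deletions both show up.
    names = known_good.keys() | current.keys()
    return sorted(n for n in names if known_good.get(n) != current.get(n))
```

The idea would be to store the snapshot somewhere the agent can't write to and fail the CI job whenever `tampered(...)` comes back non-empty, so "fixing the test" becomes a hard error instead of a prompt suggestion.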