Hacker News
by C-x_C-f 32 days ago
> All these models see a local failure and try to locally defend against it. As maintainers we have to keep pulling the conversation back to the global invariant, which is harder than it should be, and it’s laborious.

This has been by far the biggest and costliest failure mode I've experienced using these tools. I've tried to mitigate it in more ways than I can count, but it almost feels structurally impossible for LLMs to get this right.