|
|
|
|
|
by monooso
153 days ago
|
|
There was an article on HN last week (?) which described this exact behaviour in the newer models. Older, less "capable", models would fail to accomplish a task. Newer models would cheat, and provide a worthless but apparently functional solution. Hopefully someone with a larger context window than myself can recall the article in question. |
|
Purely anecdotally, I've found agents have gotten much better at asking clarifying questions, stating that two requirements are incompatible and asking which one to change, and so on.
https://spectrum.ieee.org/ai-coding-degrades