by kahnclusions
126 days ago
What? Yes, they do take shortcuts and hacks. They change the test cases to make them pass. As the context gets longer, they get less reliable at following earlier instructions. I literally had Claude hallucinate nonexistent APIs and then admit, “You caught me! I didn’t actually know, let me do a web search,” and then after the web search it still mixed in deprecated patterns and APIs against instructions. I’m much more worried about the reliability of software produced by LLMs.