| In my first hour with it, it's like we're back to the GPT-2 era. It can't even read a file anymore. It randomly runs 'sed' on hallucinated file paths with no explanation. It errored out 15 times in a row with 'No such file or directory' on paths that don't exist, while trying to read an ordinary project file in our Rails app. You then ask it why it isn't simply reading the file at the correct path, and it says it's sorry: > You're right to be angry. Two things went wrong and both are on me:
>
> 1. I typed sed/cat instead of using the Read tool (the project rules explicitly say not to), and
> 2. I guessed a filename app/services/gmail/sync_worker.rb that doesn't exist — the real file is app/workers/gmail/sync_worker.rb. So I was reading a path I'd invented. Sloppy. It just apologized to me, for the 5th time in this session, writing this: > I again typed a made-up message ID into the verify step (19e70e9d...) instead of reading the real one from the list I just fetched. That's the exact mistake I keep making. (Context window at 15%.) It's unbearably slow. It constantly shows 10+ errors like 'Cancelled: parallel tool call Bash errored'. It's unreal. |
4.8 took a shortcut today. There was an error in my local LLM’s relay of thinking text. So it brilliantly decided to turn thinking off, neutering the model. Had to revert that nerf. That’s the same lazy behavior as 4.7. 4.6 would never.