Hacker News
by misja111 59 days ago
I'm actually seeing something similar when comparing 4.6 and 4.5. 4.6 burns a lot more tokens and shows more of its thinking along the way, but I don't see a strong difference in the end result. Occasionally 4.6 even seems to get stuck in its 'processing' phase, while 4.5 doesn't on the same task.