by bnchrch
4 days ago
An 11% jump over opus 4.8 and a 22% jump over gpt 5.5 on Agentic Coding Benchmarks is certainly impressive. Obviously I still need to verify it for myself to see if it's truly a leap. But am I the only one wondering, "What can I do today that I couldn't do yesterday?" Previously I would think, "Oh, I wonder if I can finally get it to do X now?" However, now I feel like yesterday's models were more than capable of handling nearly any engineering task I paired with them on. Maybe this is the final leap where I can comfortably set up an autonomous coding loop? Maybe.