by rgmerk
5 days ago
It put the chart title directly on top of Australia. Which just about sums up my experience with using LLMs to code, really (though not with these state-of-the-art models, admittedly) - it's amazing what they can do, but left to their own devices they'll make boneheaded decisions.
|
Yeah, the whole "can run for 9 hours on a task" thing is not a positive to me.
I tend to find if Opus 4.8 runs for ~15 mins on a task, then the end result has gone off in a weird direction at some point, and it needs winding back a fair bit.
And that's with extremely clear direction, literal specification docs to follow, etc.
That being said, having functional code already in place beforehand (i.e. written by a human) goes a long way toward ensuring the AI model has a path it can build on without making too many dumb architectural choices by itself. Generally.