by jonas21
12 days ago
What did your AI-assisted workflow look like 1 year ago? I can only speak for myself, but I would carefully specify a class or module in great detail and then hand it off to the model to implement, then carefully review the result. How about 2 years ago? Back then, I wouldn't even trust it to write a 5-line function without making some sort of silly mistake. Today, I can leave an agent running by itself for 20 or 30 minutes and most of the time, it comes back with a result that's either flawless or can be refined to be good with a few back-and-forth messages. Maybe I still have to make some high-level decisions ahead of time, but all of the details, including exploring the codebase and figuring out what to do based on that, can be left to the agent. The amount of improvement just in the last 2 years has been staggering. Now extrapolate how things will look if the trend continues for another 2 or 3 years. Is this guaranteed to happen? No. But people have been predicting that we're going to hit a wall for a long time now, and we haven't yet. Maybe there's a wall just ahead of us. But maybe there's not -- and the "not" case seems likely enough that we should at least be planning for it.
In the past year, I haven't noticed a change in what I trust a model to generate in response to a single prompt. The failure modes are unchanged. Yes, specific failures have improved as they have been documented and passed into model training data, but the way the models fail has not changed. They still fail for me nearly every single day. I'm a pretty heavy user - 3-4 Claude Code processes running at a time, all day every day.
What has gotten better is tooling around the model -- but there's no space for exponential growth there. At least, not without an exponential cost increase, which would make the whole thing untenable anyway.