I think it makes sense that GP is skeptical of this article considering it contains things like:
> this tool is improving itself, learning from every interaction
which seems to indicate a fundamental misunderstanding of how modern LLMs work: the 'improving' happens when humans train/refine existing models offline to create new models, and the 'learning' is just filling the context window with more stuff, not enhancement of the actual model. It will forget everything if you drop the context, and as the context grows it can 'forget' things it previously 'learned'.
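To make the context-window point concrete, here's a toy sketch. It's my own illustration, not how any particular vendor implements it: `ContextWindow`, `max_tokens` and the whitespace word count standing in for a tokenizer are all made up, and real systems may summarize or truncate differently. The point is just that nothing resembling model weights changes here, only a bounded buffer.

```python
from collections import deque


class ContextWindow:
    """Toy stand-in for an LLM context: a bounded buffer, not a model."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens  # hypothetical budget, counted in words
        self.messages = deque()

    @staticmethod
    def _tokens(text: str) -> int:
        # Crude assumption: one whitespace-separated word == one token.
        return len(text.split())

    def add(self, text: str) -> None:
        """Append a message, evicting the oldest ones once over budget."""
        self.messages.append(text)
        while (
            sum(self._tokens(m) for m in self.messages) > self.max_tokens
            and len(self.messages) > 1
        ):
            self.messages.popleft()

    def remembers(self, phrase: str) -> bool:
        """True only if the phrase is still somewhere in the buffer."""
        return any(phrase in m for m in self.messages)

    def drop(self) -> None:
        """Start a new session: everything 'learned' is gone."""
        self.messages.clear()
```

With a budget of 6, adding "the api key lives in vault" and then "use tabs not spaces" pushes the first message out, so `remembers("vault")` goes from true to false without anyone asking it to forget.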
When you consider the "tool" as more than just the LLM, but also the stuff wrapped around calling that model, then I feel like you can make a good argument it's improving when it keeps context in a file on disk and constantly updates and edits that file as you work through the project.
I do this routinely for large initiatives I'm kicking off through Claude Code - it writes a long detailed plan into a file and as we work through the project I have it constantly updating and rewriting that document to add information we have jointly discovered from each bit of the work. That means every time I come back and fire it back up, it's got more information than when it started, which looks a lot more like improvement from my perspective.
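Roughly what I mean, as a minimal sketch. To be clear, this is not Claude Code's actual mechanism or API; `load_plan`, `record_finding`, `start_session` and the `PLAN.md` filename are hypothetical names I'm using to show the shape of the idea: the model is unchanged, but each new session is seeded from a file that keeps growing.

```python
from pathlib import Path


def load_plan(path: Path) -> str:
    """Return the plan file's contents, or an empty string if it doesn't exist yet."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def record_finding(path: Path, finding: str) -> None:
    """Append something discovered during the session so it outlives the context."""
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- {finding}\n")


def start_session(path: Path) -> list[str]:
    """Build a fresh context; the only carried-over state is what's on disk."""
    plan = load_plan(path)
    return [plan] if plan else []
```

Usage would look like calling `record_finding(Path("PLAN.md"), "...")` as you go, then `start_session(Path("PLAN.md"))` next time you fire it up. The context starts empty on day one and non-empty every day after, which is the "improvement" I'm describing.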
You're letting Claude do your programming for you, and then sweeping up whatever it does afterwards. Bluntly, you're off-loading your cognition to the machine. If that's fine by you then that's fine enough, it just means that the quality of your work becomes a function of your tooling rather than your capabilities.
I don't agree. The AI largely does the boring and obvious parts. I'm still deciding what gets built and how it is designed, which is the interesting part.
Personally, I spend _more_ time thinking with Claude. I can focus on the design decisions while it does the mechanical work of turning that into code.
Sometimes I give the agent a vague design ("make XYZ configurable") and it implements it the wrong way, so I'll tell it to do it again with more precise instructions ("use a config file instead of a CLI argument"). The best thing is you can tell it after it wrote 500 lines of code and updated all the tests, and its feelings won't be hurt one bit :)
It can be useful as a research tool too. For instance, I was porting a library to a new language, and I told the agent to 1) find all the core types and 2) for each type, run a subtask to compare the implementation in each language and write a markdown file that summarizes the differences with some code samples. 20 min later I had a neat collection of reports that I could refer to while designing the API in the new language.