But it's a self-fulfilling prophecy. They need all this stuff because it's a vibe-coded app where bugs are randomly introduced, the architecture is overcomplicated and sucks, and stuff is just added for the fun of it.
Do existing companies run entire end-to-end product integration tests on every single change they make to a repo to make sure something hasn't broken? No, they just architect things in a way such that a minor change to something can be tested in isolation. And that can be automated, deterministically and efficiently.
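To make "tested in isolation" concrete, here is a minimal sketch. Everything in it is hypothetical (the `Job` type and `due_jobs` function are invented for illustration and are not taken from OpenClaw or any real codebase). The idea is that the scheduling decision is a pure function of its inputs, so a change to it can be checked without a clock, a network, or an end-to-end run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    # Hypothetical job record, invented for this sketch.
    name: str
    interval_s: float
    last_run: float


def due_jobs(jobs: List[Job], now: float) -> List[str]:
    """Return names of jobs whose interval has elapsed.

    The current time is passed in rather than read from the system clock,
    so the result depends only on the arguments.
    """
    return [j.name for j in jobs if now - j.last_run >= j.interval_s]


def test_due_jobs() -> None:
    jobs = [Job("backup", 60, 0.0), Job("report", 3600, 0.0)]
    assert due_jobs(jobs, now=59.0) == []
    assert due_jobs(jobs, now=60.0) == ["backup"]
    assert due_jobs(jobs, now=3600.0) == ["backup", "report"]


test_due_jobs()
```

A test like this gives the same answer on every run, which is what makes it cheap to put in front of every commit.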
Where I work we can release changes to our production site in minutes, almost completely autonomously and with high confidence, with absolutely zero AI agents in the loop. How did we do it? With lessons learned from the past 5 decades of professional software development experience.
Let's not forget what OpenClaw is at its core. It's a glorified cron scheduler. Why on earth does any of this effort need to exist? It's not that deep, it's not that complex, it's all AI for AI's sake.
OpenClaw has surprisingly few "dumb" bugs. Is it as stable and secure as the Linux kernel? God no, obviously not. But it has never just crashed for me, for example. Bugs are of the type "X with Y and Z disabled and T turned on - doesn't work", where you're likely one of a few people that have ever tried this combination. Not to mention it can then debug itself and file a bug report, with a bugfix - if you give it a GitHub token.
I run it in a firewalled VM and am very conscious about any tokens I give it access to. So far, for all I know, this was unnecessary.
PS. for me the core feature of OpenClaw isn't the cron, though that is nice. It's the memory and instant extensibility. Like it takes 5-15 minutes to add an SSH tool where all agent requests go through a manual review, together with a good auto loaded description that just works in all future sessions.
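For anyone wondering what a "manual review" gate like that might look like, here is a rough sketch. It does not use OpenClaw's actual tool API, which I am not reproducing here; `make_reviewed_runner`, `ask_human`, and `ssh_execute` are names made up for illustration. The only real calls are `input` and `subprocess.run` from the Python standard library.

```python
import subprocess
from typing import Callable


def make_reviewed_runner(
    approve: Callable[[str], bool],
    execute: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap an executor so every command must pass a reviewer first."""

    def run(command: str) -> str:
        if not approve(command):
            # Nothing is executed unless the reviewer says yes.
            return "DENIED: reviewer rejected the command"
        return execute(command)

    return run


def ask_human(command: str) -> bool:
    # Hypothetical reviewer: prompt on the terminal and wait for a yes.
    return input(f"Agent wants to run {command!r}. Allow? [y/N] ").strip().lower() == "y"


def ssh_execute(host: str) -> Callable[[str], str]:
    # Hypothetical executor: runs the command on `host` via the ssh client.
    def execute(command: str) -> str:
        result = subprocess.run(
            ["ssh", host, command], capture_output=True, text=True, check=False
        )
        return result.stdout + result.stderr

    return execute
```

Something like `run = make_reviewed_runner(ask_human, ssh_execute("pi.local"))` would wire it together (the host name is a placeholder). Keeping the reviewer and executor as plain callables means the gate itself can be tested with fakes, no SSH needed.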
For the few weeks in which I’ve been using it, it has brought down the Raspberry Pi it’s running on several times with extreme resource hogging, local history/memory search is broken due to a trivial bug for which all issues are auto-closed by bots, and it has changed its configuration standards a handful of times in a way that broke my instant messaging access to it, just to name a few gripes.
This is clearly an implementation and not a conceptual issue, as I had none of these issues using the same model with Hermes, for example.
"All that automation allows us to run extremely lean"
He has a different opinion of what it means to be lean than almost everyone else. That's fine, he's allowed to, but it's something you have to understand to make sense of any of his comments on things. He has a radically different set of values to most people.
His team is basically him and two other humans, powering an ambitious well-known project so successful an industry titan ended up acquihiring him/them. That's pretty lean, no?
The ambitious idea is actually giving a chatbot/agent access to a bunch of personal data and having it self-modify its harness and context to some extent.
But it costs $1.3M USD a month to run, not including their salaries. That's the cost of a team of 50-200 staff, depending on where you're hiring.
I don't think there's any way most people would call that lean. It's lean on exactly one axis, which is people, but no one really cares about that; headcount is always a proxy for cost.
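For what it's worth, the 50-200 staff comparison above can be unpacked with quick arithmetic. The only input taken from the thread is the $1.3M monthly figure; the per-head numbers are simply what that comparison implies, not independent salary data.

```python
MONTHLY_SPEND_USD = 1_300_000  # figure quoted in the thread


def implied_cost_per_head(monthly_spend: float, headcount: int) -> float:
    """Monthly cost per person if the spend were split across a team."""
    return monthly_spend / headcount


for headcount in (50, 200):
    monthly = implied_cost_per_head(MONTHLY_SPEND_USD, headcount)
    print(f"{headcount} staff -> ${monthly:,.0f}/month, ${monthly * 12:,.0f}/year each")

# 50 staff -> $26,000/month, $312,000/year each
# 200 staff -> $6,500/month, $78,000/year each
```

So the range corresponds to roughly $78k to $312k per person per year, which is why the answer depends so heavily on where you're hiring.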
He said in another thread there are 6 people involved. 6 people for this project doesn't feel lean, without even considering the enormous LLM spend/complexity.
Where is that figure from? I would be extremely surprised if that doesn't drop at least an order of magnitude as the hype wears off. Assuming it's even representative of today and not two months ago.
If these methods prove successful it isn't going to matter. A user doesn't care if code is 'slop' or artisanal, so long as the app/site/whatever works.
If you can combine autonomous flows (and millions of dollars in tokens) to produce work comparable to a traditional engineering team, then why would the user care which wrote the app/site/whatever?
> People freaking out over my AI spend. What nobody sees: Part of what excites me so much about working on OpenClaw is that I'm trying to answer the question:
> How would we build software in the future if tokens don't matter?
> [...]
> All that automation allows us to run this project extremely lean.
Good thing we cleared that up. Another gem from the "if we had self-driving cars, we could just have them cruise on the roads endlessly when not in use and get rid of so much parking space!" school of resource management...