| I'm surprised not to see much pushback on your point here, so I'll provide my own.
We have an existence proof for intelligence that can improve AI: humans can do this right now.
Do you think AI can't reach human-level intelligence? We have an existence proof of human-level intelligence: humans. If you think AI will reach human-level intelligence, then recursive self-improvement naturally follows. How could it not?
Do you think human-level intelligence is some kind of natural maximum? Why? That would be strange, no? Even if you think it's some natural maximum for LLMs specifically, why? And why do you think we wouldn't modify architectures as needed to continue to make progress? That's already happening: our LLMs are a long way from the pure text prediction engines of four or five years ago.
There is already a degree of recursive improvement going on right now, but with humans still in the loop. AI researchers currently use AI in their jobs, and despite the recent study suggesting AI coding tools don't improve productivity in the circumstances they tested, I suspect AI researchers' productivity is indeed increased through use of these tools. So we're already on the exponential recursive-improvement curve; it's just that it's not exclusively "self" improvement until humans are no longer a necessary part of the loop.
On your specific points:
> 1. What if increasing intelligence has diminishing returns, making recursive improvement slow?
Sure. But this is a point of active debate between "fast take-off" and "slow take-off" scenarios. It's certainly not settled among rationalists which is more plausible, and it's a straw man to suggest they all believe in a fast take-off scenario. But both fast and slow take-off due to recursive self-improvement are still recursive self-improvement, so if you only want to criticise the fast take-off view, you should speak more precisely.
I find both slow and fast take-off plausible, as the world has seen both periods of fast economic growth through technology and periods of slower growth. It really depends on the details, which brings us to:
> 2. LLMs already seem to have hit a wall of diminishing returns
This is IMHO false in any meaningful sense. Yes, we have to use more computing power to get improvements without doing any other work. But have you seen METR's metric [1] on AI progress in terms of the (human) duration of tasks they can complete? This is an exponential curve that has not yet bent, and if anything has accelerated slightly.
Do not confuse GPT-5 (or any other incrementally improved model) failing to live up to unreasonable hype for an actual slowing of progress. AI capabilities are continuing to increase. Being on an exponential curve often feels unimpressive at any given moment, because the relative rate of progress isn't increasing. This is a fact about our psychology: if we look at actual metrics (ones without a natural cap, unlike evals that max out at 100%, which are not good for measuring progress in the long run), we see steady exponential progress.
> 3. What if there are several paths to different kinds of intelligence with their own local maxima, in which the AI can easily get stuck after optimizing itself into the wrong type of intelligence?
This seems valid. But it seems to me that unless we see METR's curve bend soon, we should not count on this. LLMs have specific flaws, but I think if we are honest with ourselves and not over-weighting the specific silly mistakes they still make, they are on a path toward human-level intelligence in the coming years.
I realise that claim will sound ridiculous to some, but I think this is in large part due to people instinctively internalising that everything LLMs can do is not that impressive (it's incredible how quickly expectations adapt), and therefore over-indexing on their remaining weaknesses, even though those weaknesses are shrinking over time as well. If you showed GPT-5 to someone from 2015, they would be telling you this thing is near human intelligence or even more intelligent than the average human. I think we all agree that's not true, but I think that superficially people would think it was if their expectations weren't constantly adapting to the state of the art.
> 4. Once AI realizes it can edit itself to be more intelligent, it can also edit its own goals. Why wouldn't it wirehead itself?
It might - but do we think it would? I have no idea. Would you wirehead yourself if you could? I think many humans do something like this (drug use, short-form video addiction), and I expect AI to have similar issues (and this is one reason it's dangerous). But most of us don't feel this is an adequate replacement for "actually" satisfying our goals, and wouldn't feel inclined to modify our own goals to make it so, even if we were able.
> Knowing Yudowsky I'm sure there's a long blog post somewhere where all of these are addressed with several million rambling words of theory
Uncalled for, I think. There are valid arguments against you, and you're pre-emptively dismissing responses to you by vaguely criticising their length. This comment is longer than yours, and I reject any implication that that weakens anything about it. Your criticisms are three "what ifs" and a (IMHO) falsehood - I don't think you're doing much better than "millions of words of theory without evidence".
To the extent that it's true Yudkowsky and co theorised without evidence, I think they deserve cred, as this theorising predated the current AI ramp-up, at a time when most would have thought AI anything like what we have now was a distant pipe dream. To the extent that this theorising continues in the present, it's not without evidence - I point you again to METR's unbending exponential curve.
Anyway, I contend your points comprise three "what ifs" and (IMHO) a falsehood. Unless you think "AI can't recursively self-improve" already has strong priors in its favour, such that strong arguments are needed to shift that view (and I don't think that's the case at all), this is weak. You will need to argue why we should need strong evidence to overturn a default "AI can't recursively self-improve" view, when it seems that a) we are already seeing recursive improvement (just not purely "self"-improvement), and b) it's very normal for technological advancement to have recursive gains - see e.g. Moore's law or technological contributions to GDP growth generally.
Far from a damning example of rationalists thinking sloppily, this particular point seems like one that shows sloppy thinking on the part of the critics. It's at least debatable, which is all it has to be for calling it "the biggest nonsense axion" to be a poor point.
[1] https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com... |
Regarding LLMs, I think METR is a decent metric. However, you have to consider the cost of achieving each additional hour or day of task horizon. I'm open to correction here, but I would bet that the cost curves are steeper exponentials than the improvement curves. That would be fundamentally unsustainable and point to a limitation of LLM training/architecture for reasoning and world modeling.
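To make that bet concrete, here is a minimal sketch, assuming both task horizon and cost are clean exponentials. The doubling times are made-up placeholders, not METR data or real training costs, and the function names are hypothetical. It only shows the arithmetic: if cost doubles faster than horizon, the cost per hour of task horizon itself grows exponentially.

```python
def exponential(initial, doubling_time_months, months):
    """Value of a quantity that doubles every `doubling_time_months`."""
    return initial * 2 ** (months / doubling_time_months)


def cost_per_horizon_hour(months, horizon_doubling_months, cost_doubling_months,
                          initial_horizon_hours=1.0, initial_cost=1.0):
    """Cost divided by task horizon after `months`, in arbitrary units.

    All inputs are illustrative assumptions, not measured values.
    """
    horizon = exponential(initial_horizon_hours, horizon_doubling_months, months)
    cost = exponential(initial_cost, cost_doubling_months, months)
    return cost / horizon


if __name__ == "__main__":
    # Hypothetical: horizon doubles every 6 months, cost doubles every 4 months.
    for months in (0, 12, 24, 36):
        ratio = cost_per_horizon_hour(months,
                                      horizon_doubling_months=6,
                                      cost_doubling_months=4)
        print(f"month {months:2d}: cost per horizon-hour = {ratio:.1f}x baseline")
```

If the two doubling times were equal, the ratio would stay flat, so the whole question is which curve is actually steeper, and that is an empirical matter this sketch cannot settle.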
Basically, I think the focus on recursive self-improvement is not really important in the real world. The actual question is how long and how expensive the learning process is. I think the answer is that it will be long and expensive, just like in our current world. No doubt having many more intelligent agents will help speed up parts of the loop, but there are physical constraints you can't get past no matter how smart you are.