Hacker News
by js8 33 days ago
I think we're close to the plateau of what LLMs can do, but they will keep improving. IMHO the results are already showing diminishing returns.

The (leading) LLMs work by consensus, like Wikipedia, OpenStreetMap, web search engines, or the open-source movement.

What I mean is that if I ask an LLM to "create a linked list", its understanding (of what I want) is already close to the expected ideal, just like the Wikipedia article on linked lists, for example.

But LLMs will continue to improve in breadth and depth of understanding the world, although technically (what they CAN do) they have probably already peaked. Similarly, the OSS movement technically peaked in the 90s with the creation of a compiler, an operating system and a database; that doesn't mean new open source isn't being created.

1 comments

There is so much money at stake, and so much money pouring into AI development, that I think we are going to continue to see gains for a while. People keep coming up with new agent harness techniques like chain of thought, tool calling, and memories. And then the big LLM companies figure out how to actually train their models to optimize the use of those techniques. To claim that we are reaching the top of the plateau is to claim that we are out of effective ideas for improvement. I think that's a ridiculous claim; the technology is too new. And because of the strong incentives to keep making these things better, it's pretty much a given that people will continue to explore ideas until we really are out of effective ones. I don't think anyone apart from professional AI researchers has any idea where this is all going to settle.
Well, it depends what you mean by peak. I was answering the parent's question of what LLMs CAN do. It's not about the peak of the technology or of humanity itself.

LLMs (or specifically the GPT algorithm) are 8 years old. They have matured as a technology. I am not sure how you imagine them being significantly improved, from a user's point of view, without some kind of paradigm shift (i.e. something significantly different from GPT or LLMs).

Although I can imagine one important social innovation yet to come: a generally available big public LLM that "anybody can train". We had the technology of the "encyclopedia" for years (famously Britannica), yet Wikipedia was a truly new take on the encyclopedia.

Also, new kinds of AI might emerge. For example, we might formalize all types of human reasoning and build a reasoning AI, as well as a model of human language, from scratch rather than by training via GPT (and thus more understandable and potentially smaller). But that won't be an LLM.

One major axis on which LLMs could get better is energy (and in general material) efficiency: doing the same stuff they do now but with fewer inputs. I actually fear that we are very early on this curve. The period of time between electric arc lamps and the current state of affairs, where electric light is almost free, was more than a hundred years. Lots and lots of investment is taking place under the assumption that LLMs will ride that curve much, much faster. If they don't (or if there is some physical law which means we're already close to some asymptotic limit), then we're talking about a generational misinvestment, and that's only one of the underlying but potentially false assumptions of the AI investment boom.
> I am not sure how you imagine it being significantly improved, from a user point of view, without some kind of paradigm shift

I proposed how. New harness techniques and new training data/techniques, so the harness gets better and the LLM can be trained to work better with the harness. There's no reason to believe we're out of momentum for improvement in that direction.

Yeah, but what do you mean by (substantially) better in this context? What is the outcome? Modern models can understand the requirements as well as humans can.

However, they also make mistakes like humans do. I don't think a better harness or better training will fix that, because fundamentally they cannot read your mind if you put in an ambiguous prompt.

I like to compare the process of turning inexact text into formal language to an error-correcting code. If you haven't made too many mistakes, or have been precise in the specification, it will self-correct and do what you want. But if your input is too ambiguous, it will never do exactly what you want, only something close to it. And people (who are using AI) are still learning where the boundary is and how to tell.
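
To make the analogy concrete, here is a toy sketch. It is only an illustration under my own assumptions: a 5x repetition code with majority-vote decoding, with `encode`, `decode` and `flip` being made-up helper names. It says nothing about how an LLM works internally; it just shows the "few errors get corrected, too many give you something close but wrong" behaviour.

```python
# Toy illustration of the error-correcting-code analogy (an assumption for
# the sake of the example, not a model of any real LLM or harness).

def encode(bits, n=5):
    """Repeat every bit n times (a simple repetition code)."""
    return [b for b in bits for _ in range(n)]

def decode(received, n=5):
    """Recover each bit by majority vote over its block of n symbols."""
    blocks = [received[i:i + n] for i in range(0, len(received), n)]
    return [1 if sum(block) > n // 2 else 0 for block in blocks]

def flip(word, positions):
    """Return a copy of word with the bits at the given positions flipped."""
    return [bit ^ 1 if i in positions else bit for i, bit in enumerate(word)]

intent = [1, 0, 1]            # what the user "really means"
sent = encode(intent)         # the redundant, precise specification

# A couple of scattered mistakes: the decoder self-corrects.
slightly_noisy = flip(sent, {0, 6})
# Too many mistakes in one place: the result is close, but not the intent.
very_noisy = flip(sent, {0, 1, 2})

print(decode(slightly_noisy))  # [1, 0, 1]
print(decode(very_noisy))      # [0, 0, 1]
```

In the second case the decoder still returns a valid message, just not the one that was intended, and nothing in its output tells you that it crossed the boundary.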

The companies building these models are training them to react to typical expectations. If you have some special need, you will always have to tell the model, otherwise it will not know your exact context. And the harnesses have many tools for that or try to do that automatically already.