| > Its a paradigma shift for the whole world literaly. That's hyperbolic. I use LLMs daily. They speed up tasks you'd normally use Google for and can extrapolate existing code into other languages. They boost productivity for professionals, but it's not like the discovery of the steam engine or electricity. > And what limitations are obvious? Tell me? We have not reached any real ceiling yet. Scaling parameters is the most obvious limitation of the current LLM architecture (transformers). That’s why what should have been called GPT-5 is instead named GPT 4.5, it isn’t significantly better than the previous model despite having far more parameters, a lot more cleaned up training data and optimizations. The low-hanging fruit has already been picked, and most obvious optimizations have been implemented. As a result, almost all leading LLM companies are now operating at a similar level. There hasn’t been a real breakthrough in over two years. And the last huge architectural breakthrough was in 2017 (with paper "Attention is all you need"). Scaling at this point yields only diminishing returns. So no, what you’re saying isn’t accurate, the ceiling is clearly visible now. |
Completely disagree. People might have googled before, but the human-computer interface was never anywhere near as accessible to a normal human being as it is now. Can I use Photoshop? Yes, but I had to learn it. My sisters played around with DALL-E and are now able to do similar things.
It might feel boring to you that technology accessibility trickles down like this, but it changes a lot for a lot of people. The entry barrier to everything got a lot lower. It makes a huge difference to you as a human being whether or not you have rich parents and good teachers. Before, you never had the chance to just get help like this. Millions of kids struggle because they don't have parents they can ask the questions they need answered to understand topics in school.
- Steam engine = fundamental for our scaling economy
- Electricity = fundamental for liberating all of us from daylight hours
- Internet = interconnecting all of us
- LLM/ML/AI = liberating knowledge through accessibility
> 'There hasn’t been a real breakthrough in over two years.'

DeepSeek alone was a real breakthrough.
But let me ask an LLM about this:
- Mixture of Experts (MoE) scaling
- Long-context handling
- Multimodal capabilities
- Tool use & agentic reasoning
Funnily enough, your comment came just before the Claude 4 release (again an increase in performance, etc.) and Google I/O.
We don't know whether we have found all the 'low-hanging fruit'. The Meta paper about thinking in latent space came out in February. I would definitely call that low-hanging fruit.
We are very hard-limited by infrastructure. Every experiment you want to try consumes a lot of it. If you look at the top GPU AI clusters, we don't have that many on the planet. We have Google, Microsoft (Azure), Nvidia, Baidu, Tesla, xAI and Cerebras. Not that many researchers are able to just work on this.
Google now has its first diffusion-based model live. In 2025! We are nowhere near done testing new approaches, architectures, etc. And we are optimizing on every front: cost, speed, precision, etc.