by vlovich123 37 days ago
Why is this presumed to be de facto inevitable:

* local compute isn’t scaling as before, so algorithmic improvements are the only way models get meaningfully faster and smarter

* all those same algorithmic improvements would also be true for larger models

* hardware manufacturers have an incentive against local LLMs because cloud LLMs are so much more lucrative (+ corps would buy desktop variants if they were good enough)

So no, it’s not clear quality will ever be comparable. It may be good enough for what you want, but there will always be a harder problem that you need to throw more compute and more memory at.


> It may be good enough for what you want but there will always be a harder problem that you need to throw more compute and more memory at.

Sure, but if the “good enough for what you want” covers the vast majority of cases, data-center AI becomes just for the very extreme edge cases. Like how I can render a 4K video game at 60fps on my home PC, but if Pixar wants to render their next movie they use data-center compute.

> all those same algorithmic improvements would also be true for larger models

Smaller models run faster. If ten runs of a small model get me the same quality result as one run of the big model, and the small model runs 10x faster, then they are functionally the same.

Even accepting the premise, it should be obviously true that 10 dumber models running 10x as fast != 1 smarter model. Otherwise engineering would just be a matter of throwing people at a problem, when it’s very clear that 1 talented engineer can outperform a team of engineers or accomplish things the team would never have been able to. There’s also the assumption you’re making that a 10x smaller model is 10x dumber when it’s not. It’s a curve, and some people seem to struggle with non-linear effects.

> it should be obviously true that 10 dumber models running 10x as fast != 1 smarter model

If a smaller model tries ten things and comes to the same conclusion as the big model gets first try, then yeah 10x small = 1x big. Is that where we are at now? Idk probably not - but it’s not hard to imagine something like that emerging soon. There is already evidence that smaller models get some things _better_ than bigger models (e.g. https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag... )
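
To put rough numbers on the "ten tries vs. one try" argument, here is a minimal sketch. It assumes two things nobody in this thread has established: that repeated attempts by the small model are independent, and that there is a reliable way to tell when an attempt is correct. The function names and the success rates are made up for illustration, not measured from any real model.

```python
def pass_at_k(p: float, k: int) -> float:
    """Chance that at least one of k independent attempts succeeds,
    given a per-attempt success probability p (an assumed number)."""
    return 1.0 - (1.0 - p) ** k


def attempts_needed(p_small: float, p_big: float, max_k: int = 10_000):
    """Smallest k such that k small-model attempts succeed at least as
    often as a single big-model attempt. Returns None if no k up to
    max_k is enough (e.g. the small model never succeeds)."""
    for k in range(1, max_k + 1):
        if pass_at_k(p_small, k) >= p_big:
            return k
    return None


if __name__ == "__main__":
    # Hypothetical rates: small model right 20% of the time, big model 90%.
    print(pass_at_k(0.2, 10))
    print(attempts_needed(0.2, 0.9))
    # A much weaker small model needs far more than ten tries.
    print(attempts_needed(0.05, 0.9))
```

With those made-up rates, ten tries at 20% each land at roughly 0.89, just short of one try at 90%, so it takes an eleventh. Drop the small model to 5% and it needs 45. Whether "10x small = 1x big" holds depends on how far apart the per-attempt success rates are, and if the attempts are correlated (the small model keeps making the same mistake) the real picture is worse than this.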

> There’s also the assumption you’re making that a 10x smaller model is 10x dumber when it’s not

That is not an assumption I am making. I said “a smaller model”, not “a 10x smaller model”. Model speed and model “intelligence” are both non-linear.

> Like how I can render a 4k rez video game at 60fps on my home pc, but if pixar wants to render their next movie they use data-center compute.

This is a very nice analogy actually and it impacts the whole story about US vs. Chinese leadership in "frontier AI".