by pdksam 1150 days ago
If anything performance should get better with time
2 comments

What I mean is that resources will be limited, or that slightly worse but much more cost-effective models will be released.

This is often the case with these types of technologies.

So far, new tech has generally been both cheaper and better than what it replaced.

You don't have to look far: GPT-3.5 Turbo.

Again, not what I'm saying.
Why do you think that would be the case?
Moore's law-ish optimization.

Your Z80 computer cost $700 in the late '70s... they're now in sub-$1 embedded controllers.

But what is being optimized? Hardware sure isn't getting faster in a hurry, and I don't see anything on the horizon that will aid in optimizing software.
The various open source LLMs are doing things like reducing bits-per-parameter to reduce hardware requirements; if they're using COTS hardware it almost certainly isn't optimised for their specific models; Moore's Law is pretty heavily reinterpreted, so although we normally care about "operations per second at a fixed number of monies", what matters here is "joules per operation", which can improve by a huge margin even before reaching human level, which itself appears to be a long way from the limits of the laws of physics; and even if we were near the end of Moore's Law and there was only a 10% total improvement available, that's 10% of a big number.
Moore's law was an effect that stemmed from the locally exponential efficiency increase from designing computers using computers, each iteration growing more powerful and capable of designing still more powerful hardware.

10% here and there is very small compared to the literal orders of magnitude improvements during the reign of Moore's Law.

I don't really see anything like that here.

> 10% here and there is very small compared to the literal orders of magnitude improvements during the reign of Moore's Law.

I can't confirm it, but I noticed this comment says "gpu tech has beat Moore’s law for DNNs the last several years":

https://news.ycombinator.com/item?id=35653231

> 10% here and there is very small compared to the literal orders of magnitude improvements during the reign of Moore's Law.

Missing the point, despite being internally correct: 10% of $700k/day is still $25M/y.
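
For anyone who wants to check the arithmetic, here is a quick sketch. It only uses the two numbers from the comment ($700k/day and 10%); the function name and the 365-day year are my own assumptions.

```python
def annual_savings(daily_cost_usd, fraction_saved, days_per_year=365):
    """Yearly savings from trimming a fixed fraction off a daily running cost."""
    return daily_cost_usd * fraction_saved * days_per_year


# Figures taken from the comment: $700k/day running cost, 10% improvement.
savings = annual_savings(700_000, 0.10)
print(f"${savings / 1e6:.2f}M per year")
```

That comes out at roughly $25.5M per year, which matches the "$25M/y" figure.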

If you'd instead looked at my point about energy cost per operation, there's room for something like a 46,000x improvement just to reach human level, and 5.3e9x to reach the Landauer limit.
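
As a rough sketch of where a number like that can come from: the Landauer limit is the minimum energy to erase one bit, k_B * T * ln(2). The 300 K temperature below is my assumption (the comment doesn't state one), and the "implied" energy per operation is simply the 5.3e9 ratio multiplied back out, not a measured figure.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant, exact in the 2019 SI


def landauer_limit_joules(temperature_k=300.0):
    """Minimum energy to erase one bit at the given temperature."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


limit = landauer_limit_joules()  # assumes room temperature, about 300 K

# Ratio quoted in the comment; working backwards gives the energy per
# operation the commenter would have been assuming for current hardware.
RATIO_TO_LANDAUER = 5.3e9
implied_energy_per_op = limit * RATIO_TO_LANDAUER

print(f"Landauer limit at 300 K: {limit:.3e} J per bit")
print(f"Implied current energy:  {implied_energy_per_op:.3e} J per operation")
```

Under those assumptions the limit is about 2.9e-21 J per bit, so the quoted ratio implies something on the order of 1.5e-11 J per operation today.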

There are a few avenues. Further specialization of hardware around LLMs, better quantization (3 bits/p seems promising), improved attention mechanisms, use of distilled models for common prompts, etc.
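
To put a number on the quantization point, here is a back-of-the-envelope sketch of weight storage versus bits per parameter. The 7-billion-parameter model size is a hypothetical example of mine, and this counts weights only (no activations, KV cache, or quantization metadata).

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate storage for model weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 7e9  # hypothetical model size, chosen only for illustration

for bits in (16, 8, 4, 3):
    print(f"{bits:>2} bits/param: {weight_memory_gb(NUM_PARAMS, bits):.3f} GB")
```

Going from 16 bits to 3 bits per parameter cuts the weight footprint by a bit more than 5x, which is the difference between needing a datacenter GPU and fitting on consumer hardware.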
These would be optimizations, which is not really the same thing as Moore's law-like growth. That growth was absolutely mind-boggling; it's hard to even wrap your head around how fast tech was moving in that period, since humans don't really grok exponentials too well. We just think they look like second-degree polynomials.
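
A small sketch of the exponential-versus-polynomial point, assuming the commonly quoted two-year doubling period (my assumption, not a figure from this thread) and comparing it with plain quadratic growth:

```python
def exponential_growth(years, doubling_period_years=2.0):
    """Growth factor if capability doubles every doubling_period_years."""
    return 2 ** (years / doubling_period_years)


def quadratic_growth(years):
    """Growth factor if capability grew with the square of elapsed years."""
    return years ** 2


for years in (10, 20, 40):
    exp_factor = exponential_growth(years)
    quad_factor = quadratic_growth(years)
    print(f"{years:>2} years: exponential {exp_factor:>12,.0f}x, quadratic {quad_factor:>6,}x")
```

Early on the quadratic curve is actually ahead, which is probably why intuition gets fooled; by 40 years the doubling curve is at about a million-fold while the quadratic one is at 1,600.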
Probabilistic computing offers the potential of a return to that pace of progress. We spend a lot of silicon on squashing things to 0/1 with error correction, but using analog voltages to carry information and relying on parameter redundancy for error correction could lead to much greater efficiency both in terms of OPS/mm^2 and OPS/watt.
> Hardware sure isn't getting faster in a hurry

How is it not?

These LLMs were recently trained using NVidia A100 GPUs.

Now NVidia has H100 GPUs.

The H100 is up to nine times faster for AI training and 30 times faster for inference than the A100.

Not soon, but all the major players are making even more AI-specialized silicon.