by dhruvdh 763 days ago
I don't know what you are trying to say here. If one system doesn't need to move as much data because it is more flexible, that is a good thing. What do we gain by making it "fair"?

If you're limiting the size of the model to 110 million parameters (105 MiB assuming int8) because that's what will fit onto your FPGA, then of course it's going to be more energy efficient than a Broadwell-era Xeon with a 24GB RTX 3090. It's like concluding that a rickshaw is more efficient than a train: that is absolutely true in a technical sense if you're only transporting a single passenger, but it makes no sense if you're transporting hundreds, if not thousands, of passengers.

A more apt comparison would have been with a phone made in the past 5 years. Even without an AI accelerator chip, I'm sure you could manage 20-30+ tokens/s from a 110M-parameter model, though this depends entirely on the memory bandwidth of the phone.
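
For anyone who wants to check the napkin math, here is a rough sketch. It assumes decoding is purely memory-bandwidth bound (every weight is read once per token) and ignores compute, KV cache traffic, and activation memory, so the result is a ceiling and not a prediction. The 30 GB/s figure is a made-up placeholder, not a measurement of any particular phone.

```python
def model_size_bytes(n_params: int, bytes_per_param: float = 1.0) -> float:
    """Weight footprint in bytes; int8 means 1 byte per parameter."""
    return n_params * bytes_per_param


def max_tokens_per_second(bandwidth_bytes_per_s: float, size_bytes: float) -> float:
    """Upper bound on decode rate if each token requires one full pass over the weights."""
    return bandwidth_bytes_per_s / size_bytes


size = model_size_bytes(110_000_000)   # 110M parameters at int8
size_mib = size / 2**20                # about 105 MiB, matching the figure above

# Hypothetical memory bandwidth, chosen only for illustration.
assumed_bandwidth = 30e9               # 30 GB/s
ceiling = max_tokens_per_second(assumed_bandwidth, size)

print(f"weights: {size_mib:.1f} MiB")
print(f"bandwidth-bound ceiling: {ceiling:.0f} tokens/s")
```

Under those assumptions the ceiling sits well above 20-30 tokens/s, which is why I'd expect the real number to be limited by how much of that bandwidth the phone can actually sustain.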