| HN Mirror

	by alpacalaca 812 days ago
	If you limit the model to 110 million parameters (about 105 MiB at int8) because that's all that fits on your FPGA, then of course it's going to be more energy efficient than a Broadwell-era Xeon paired with a 24 GB RTX 3090. It's like concluding that a rickshaw is more efficient than a train: absolutely true in a technical sense if you're carrying a single passenger, but it makes no sense if you're carrying hundreds or thousands. A more apt comparison would have been a phone made in the past 5 years. Even without an AI accelerator chip, I'm sure you could manage 20-30+ t/s from a 110M model, though that depends entirely on the phone's memory bandwidth.
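
For anyone who wants to check the arithmetic, here is a rough back-of-envelope sketch. It assumes one byte per parameter (int8) and that decoding is purely memory-bandwidth bound, meaning every weight is read once per generated token. The function names are made up for illustration, and the 10 GB/s figure is a placeholder, not a measurement of any real phone. The result is a ceiling, so real throughput will be lower.

```python
def model_size_mib(params: int, bytes_per_param: int = 1) -> float:
    """Weight storage in MiB, assuming a fixed number of bytes per parameter."""
    return params * bytes_per_param / 2**20


def bandwidth_bound_tokens_per_s(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed if each token requires reading all weights once."""
    return bandwidth_bytes_per_s / model_bytes


PARAMS = 110_000_000           # model size from the comment
ASSUMED_BANDWIDTH = 10e9       # hypothetical 10 GB/s, placeholder only

size_mib = model_size_mib(PARAMS)
ceiling = bandwidth_bound_tokens_per_s(PARAMS, ASSUMED_BANDWIDTH)

print(f"weights: {size_mib:.1f} MiB")
print(f"bandwidth-bound ceiling: {ceiling:.1f} tokens/s")
```

Swap in whatever sustained bandwidth your device actually delivers; compute, KV cache reads, and thermal throttling all push the real number below this ceiling.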