| HN Mirror



	by ggnore7452 851 days ago
	The Groq demo was indeed impressive. I work with LLMs a lot at work, and a generation speed of 500+ tokens/s would definitely change how we use these products (especially considering it's an early-stage product). But the "completely novel silicon architecture" and the "self-developed LPU" (claiming not to use GPUs)... make me a bit skeptical. After all, pure speed might be achievable by stacking computational power and quantizing the model. Shouldn't innovation at the GPU level be quite challenging, especially to achieve such groundbreaking speeds?

3 comments

avivweinstein 851 days ago

I work at Groq. We aren't using GPUs at all. This is a novel hardware architecture of ours that allows this high throughput and low latency. Nothing sketchy about it.


Jensson 851 days ago

> Shouldn't innovation at the GPU level be quite challenging, especially to achieve such groundbreaking speeds?

GPUs are general purpose; a purpose-built chip that is better isn't that hard to make. Google didn't have to work hard to invent TPUs, which are the same idea. They said their first tests proved the idea worked, so it didn't require anything near Nvidia's scale or expertise.


ggnore7452 851 days ago

More on the LPU and data center: https://wow.groq.com/lpu-inference-engine/

Price and speed benchmark: https://wow.groq.com/
