by andy_xor_andrew
748 days ago
When reading Hacker News you develop a signal/noise filter: lots of headlines make bold claims, and you filter them out as embellishment or exaggeration. My bullshit detector went off when I first saw Groq posted on HN - a startup is making its own chips (doubt) that perform faster than anything Nvidia has for inference (doubt) and accelerate LLMs to hundreds/thousands of tokens per second?? Mega doubt. But... then I tried their demo, and... yeah, it's that good. Such an amazing company of talented individuals.
|
The other issue they don't mention is power, space, efficiency, etc. We want to run larger models with less power, fewer server blades, and lower cost, not more server blades, more chips, and more power.