by Lindon4290 722 days ago
Yeah, the rumour(?) is that a Groq system needs 576 chips (9 racks) to produce 300+ t/s on Llama-2 70B at bs=1 [1]

So, that's like $10M+ for serving bs=1 Llama-2 70B vs whatever a single MI300X costs?

[1] https://twitter.com/swyx/status/1759759125314146699
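
For a rough sense of scale, here is the arithmetic behind those figures as a minimal Python sketch. It takes the 576 chips, the 9 racks, and the "$10M+" guess at face value; all three are rumoured or ballpark numbers from this comment, not confirmed pricing, and the helper name `per_unit` is just for illustration.

```python
# Figures quoted in the comment above; all are rumoured or rough, not confirmed.
CHIPS = 576                   # chips said to be needed for Llama-2 70B at bs=1
RACKS = 9                     # racks those chips are said to occupy
SYSTEM_COST_USD = 10_000_000  # the "$10M+" guess, treated as a lower bound


def per_unit(total: float, units: int) -> float:
    """Divide a total evenly across a number of units."""
    return total / units


chips_per_rack = CHIPS // RACKS
cost_per_chip = per_unit(SYSTEM_COST_USD, CHIPS)
cost_per_rack = per_unit(SYSTEM_COST_USD, RACKS)

print(f"chips per rack: {chips_per_rack}")
print(f"implied cost per chip: ${cost_per_chip:,.0f}+")
print(f"implied cost per rack: ${cost_per_rack:,.0f}+")
```

If the rumoured numbers hold, that is 64 chips per rack, with the per-chip and per-rack costs being lower bounds only as good as the $10M guess.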


The exact cost of an MI300X is closely guarded by AMD. I buy them and do not know how much they are. That said, a whole chassis of 8x is far, far, far less than $10M.
You should make a whole post about this! Like how a single MI300X outperforms Groq at bs=1.

300 tokens/s at bs=1 for Llama-2 70B on a single card is no joke.

This is why I sponsored the Chips and Cheese tests on my hardware. That instigated Elio to up the game even further.

All open source by the way.

Thank you for sponsoring this. There's so little buzz about this hardware despite the fact it's clearly amazing for AI use cases, and I don't understand why. Maybe this is why Nvidia is the most valuable company in the world: nobody can be bothered to try a competitor.