by Lindon4290
722 days ago
Going by the results from the article/video, a single MI300X even outperforms a Groq system [1]. The video shows the optimized Llama-2 70B run reaching 314 tokens/s at bs=1 with a 256-token prompt + 256 generated tokens. The Groq system is also a bs=1 setup and gets around 300 tokens/s. Wild!
[1] https://wow.groq.com/groq-sets-new-large-language-model-perf...
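For a rough sense of what those two numbers mean per token, here is a back-of-the-envelope sketch. It assumes both figures are steady generation rates at bs=1 (the thread does not confirm how either was measured), and the helper names are made up for illustration.

```python
# Back-of-the-envelope only: assumes the quoted tokens/s are steady
# generation rates at batch size 1, which the thread does not confirm.

def ms_per_token(tokens_per_s: float) -> float:
    """Average milliseconds per generated token at a given throughput."""
    return 1000.0 / tokens_per_s


def generation_seconds(n_tokens: int, tokens_per_s: float) -> float:
    """Seconds to generate n_tokens at a given throughput."""
    return n_tokens / tokens_per_s


# Figures quoted in the comment above; the Groq one is "around 300".
RATES = {"MI300X (video)": 314.0, "Groq (approx.)": 300.0}

for name, rate in RATES.items():
    print(
        f"{name}: {ms_per_token(rate):.2f} ms/token, "
        f"{generation_seconds(256, rate):.2f} s for 256 tokens"
    )

# Relative throughput of the MI300X figure over the Groq figure.
relative = RATES["MI300X (video)"] / RATES["Groq (approx.)"]
print(f"relative throughput: {relative:.3f}x")
```

Since the Groq figure is approximate, the gap is small enough that the two are best read as roughly on par at bs=1.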
|
Groq does not disclose how many cards they need to get those results. Someone replied to me with this comment [1] a while ago...
[0] https://www.reddit.com/r/AMD_MI300/comments/1dqhrbn/comment/...
[1] https://news.ycombinator.com/item?id=39966620