by 2uryaa 104 days ago
Hey Jack, we use GB200s for these workloads. Feel free to check those big models out on our site! We are doing Kimi, GLM, Minimax, etc.
1 comment

Nice! But that doesn’t answer the question. Do these optimizations scale to multi-device workloads or not?