by DiederikVink 548 days ago
That's a great question, but it's hard to get enough insight into how Groq is serving models to really know what's missing.

If I had to hazard a guess, it would be that their system architecture (the number of chips and the chip architecture itself) might not be designed for high-concurrency workloads.