Cerebras Inference now runs Llama 3.1-70B at 2100 tokens/s (cerebras.ai)
6 points by cs-fan-101 600 days ago