Hacker News
by conception 268 days ago
You should try Cerebras with Qwen. 2000 tokens/sec. It’s usually like chatting with the future: just an instant response.