Hacker News
by data-ottawa 287 days ago
30-40 tokens/s at 64k context, but it's a mixture-of-experts model.

A 70B dense model is slower, since all of its parameters are read for every token.

Qwen Coder 30B at Q4 runs 40+ tokens/s.
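A rough way to see why a mixture-of-experts model can outrun a larger dense one: token generation is usually limited by memory bandwidth, so the ceiling is roughly bandwidth divided by the bytes of weights read per token. The sketch below is back-of-the-envelope arithmetic only. The 400 GB/s bandwidth, the 0.5 bytes per parameter for Q4, and the 3B active-parameter count for the MoE model are all assumed figures for illustration, not measurements from this thread, and real throughput will land well below these ceilings.

```python
def decode_tokens_per_second(active_params: float,
                             bytes_per_param: float,
                             bandwidth_bytes_per_s: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound model.

    Assumes every active parameter is read from memory once per token and
    ignores KV-cache reads, compute time, and quantization overhead.
    """
    bytes_read_per_token = active_params * bytes_per_param
    return bandwidth_bytes_per_s / bytes_read_per_token


# All three constants below are hypothetical, chosen only for illustration.
BANDWIDTH = 400e9        # assumed memory bandwidth, bytes per second
Q4_BYTES_PER_PARAM = 0.5 # assumed ~4 bits per weight
MOE_ACTIVE_PARAMS = 3e9  # assumed active parameters per token for the MoE

dense_70b = decode_tokens_per_second(70e9, Q4_BYTES_PER_PARAM, BANDWIDTH)
moe_30b = decode_tokens_per_second(MOE_ACTIVE_PARAMS, Q4_BYTES_PER_PARAM, BANDWIDTH)

print(f"dense 70B ceiling: {dense_70b:.1f} tokens/s")
print(f"MoE 30B ceiling:   {moe_30b:.1f} tokens/s")
```

The point is the ratio, not the absolute numbers: the dense model pays for all 70B weights on every token, while the MoE model only pays for the experts it routes to.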