Hacker News
by 7777777phil 117 days ago
Cool hack, but that's 0.5 tok/s on a 70B when a 7B does 30+ tok/s on the same card. NVIDIA's own research says 40-70% of agentic tasks could run on sub-10B models, and the quality gap has closed fast.
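
To put rough numbers on that gap, here's a back-of-the-envelope sketch. The 0.5 and 30 tok/s figures are the ones quoted above (30 is a floor, since the claim is "30+"). The 500-token response length is an assumption picked purely for illustration, and the helper name is made up.

```python
# Back-of-the-envelope: wall-clock time to generate one response.
# Throughputs are the figures quoted in the comment; RESPONSE_TOKENS is assumed.
RESPONSE_TOKENS = 500  # assumption: a mid-sized agent reply


def seconds_to_generate(tokens: int, tok_per_s: float) -> float:
    """Time to emit `tokens` at a steady decode rate (ignores prompt processing)."""
    return tokens / tok_per_s


t_70b = seconds_to_generate(RESPONSE_TOKENS, 0.5)  # 70B at 0.5 tok/s
t_7b = seconds_to_generate(RESPONSE_TOKENS, 30.0)  # 7B at 30 tok/s (lower bound on speed)

print(f"70B: {t_70b:.0f} s (~{t_70b / 60:.1f} min)")
print(f"7B:  {t_7b:.1f} s")
print(f"speedup: {t_70b / t_7b:.0f}x")
```

Under those assumptions the 70B takes about 17 minutes per reply versus about 17 seconds for the 7B, i.e. at least a 60x gap per step. In an agent loop with many steps, that multiplies quickly.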