Hacker News
by unglaublich, 22 days ago
30 tok/s looks fine when you're just streaming code, but the issue is that a lot of those tokens are background noise: tool-calling conventions, metadata, "thinking", etc.
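
To put rough numbers on that, here is a minimal back-of-the-envelope sketch. The 50% overhead share is an assumption picked purely for illustration, not a measurement, and the function names are made up for this example.

```python
def effective_rate(raw_tok_per_s: float, overhead_fraction: float) -> float:
    """User-visible tokens per second, given the share of generated
    tokens spent on overhead (tool calls, metadata, "thinking")."""
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return raw_tok_per_s * (1.0 - overhead_fraction)


def seconds_for_visible(visible_tokens: int, raw_tok_per_s: float,
                        overhead_fraction: float) -> float:
    """Wall-clock seconds to receive a given number of visible tokens."""
    return visible_tokens / effective_rate(raw_tok_per_s, overhead_fraction)


# Assumed: 30 tok/s raw, half of all tokens are overhead (hypothetical).
print(effective_rate(30, 0.5))            # 15.0
print(seconds_for_visible(300, 30, 0.5))  # 20.0
```

So under that assumption a 300-token answer takes about 20 seconds instead of 10, which is where "fine for streaming" starts to feel slow.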