Hacker News
by
rayboy1995
23 days ago
Thanks!! I had disabled that previously while debugging. From what I can tell so far, this is helping accuracy (and speed, since the cache is preserved more often)!
satvikpendem
23 days ago
Use the MTP models, which give 2x token generation speed, for example:
https://unsloth.ai/docs/models/qwen3.6#mtp-guide
rayboy1995
23 days ago
Very interesting, I'll have to check this out. Thank you! This is why I love HN.