by
zargon
42 days ago
They're using the term speculative decoding but doing MTP (multi-token prediction). It's the same thing as Nemotron, but Google removed the MTP heads from the original safetensors release. (They were not removed from the LiteRT format.)