| HN Mirror



	by pyentropy 54 days ago
	The number of tokens you predict at a time (multi-token or not) has nothing to do with whether the model wants to emit no, some, or a lot of reasoning tokens inside the reasoning tag, much like how branch prediction does not change a for loop's iteration count.
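To make that concrete, here is a toy sketch of draft-and-verify decoding, assuming greedy decoding and exact-match verification. Everything in it (`target_next`, `draft_next`, the arithmetic that stands in for a model) is made up for illustration and is not any lab's actual MTP setup. It only shows that the emitted sequence is the same for any draft length `k`; what changes is the number of verification steps.

```python
EOS = -1  # hypothetical end-of-sequence token id


def target_next(prefix):
    """Toy deterministic stand-in for the main model's greedy next token."""
    if len(prefix) >= 12:
        return EOS
    return (prefix[-1] * 3 + len(prefix)) % 7


def draft_next(prefix):
    """Toy cheap predictor: agrees with the target except at some positions."""
    guess = target_next(prefix)
    if guess != EOS and len(prefix) % 3 == 0:
        return (guess + 1) % 7  # deliberately wrong guess
    return guess


def greedy_decode(prompt):
    """Plain one-token-at-a-time decoding."""
    seq, steps = list(prompt), 0
    while seq[-1] != EOS:
        seq.append(target_next(seq))
        steps += 1
    return seq, steps


def speculative_decode(prompt, k):
    """Draft up to k tokens, then keep only what the target agrees with."""
    seq, steps = list(prompt), 0
    while seq[-1] != EOS:
        steps += 1
        draft = []
        for _ in range(k):
            tok = draft_next(seq + draft)
            draft.append(tok)
            if tok == EOS:
                break
        # Verification (simulated sequentially here): always append the
        # target's own token, and stop at the first disagreement or at EOS.
        for tok in draft:
            correct = target_next(seq)
            seq.append(correct)
            if correct != tok or correct == EOS:
                break
    return seq, steps


if __name__ == "__main__":
    baseline, base_steps = greedy_decode([1])
    for k in (1, 2, 4):
        out, steps = speculative_decode([1], k)
        print(k, out == baseline, steps, base_steps)
```

The output length is decided by the target model alone; `k` only affects how many passes it takes to get there.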

1 comment

sometimelurker 54 days ago

No, it might. A high-reasoning task is probably harder than a low-reasoning task, so the same MTP LLM will predict more tokens correctly on the low-reasoning task. To compensate for this, big labs likely have different MTP LLMs for different cases. It would make sense for them to do this.
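A rough way to put numbers on the "more correct tokens on the easier task" part, assuming each drafted token is accepted independently with the same probability `p` (a simplification) and that drafting stops at the first rejection. The function name and the two `p` values are hypothetical, not measurements from any model.

```python
def expected_accepted(p, k):
    """Expected number of drafted tokens accepted per verification step.

    The i-th drafted token counts only if all i tokens so far were accepted,
    which happens with probability p**i under the independence assumption.
    """
    return sum(p ** i for i in range(1, k + 1))


if __name__ == "__main__":
    k = 4
    # Hypothetical acceptance rates: higher on an "easy" task, lower on a "hard" one.
    for label, p in (("easy", 0.9), ("hard", 0.6)):
        print(label, round(expected_accepted(p, k), 3))
```

Under these assumptions a lower acceptance rate means fewer accepted tokens per step, so less speedup on the harder task.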
