	by loherj 346 days ago
	Yes. If you look at the diagram that plots performance against the number of output tokens, you can see that R1T2 uses about 1/3 of the output tokens that R1-0528 uses. Keep in mind that the speed improvement doesn’t come from the model generating tokens any faster (it’s the exact same architecture as R1, after all) but from using fewer output tokens while still achieving very good results.
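
	To make the arithmetic concrete, here is a rough sketch. The throughput and token counts are made-up illustrative numbers, not measurements; the only thing taken from the comment above is the roughly 1/3 token ratio and the assumption that both models generate at the same per-token speed.

```python
def generation_time(output_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to emit `output_tokens` at a fixed decoding speed."""
    return output_tokens / tokens_per_second


# Hypothetical numbers, chosen only for illustration.
TOKENS_PER_SECOND = 30.0           # same for both models (same architecture)
R1_0528_TOKENS = 9000              # assumed answer length for R1-0528
R1T2_TOKENS = R1_0528_TOKENS // 3  # about 1/3 of the output tokens

t_r1_0528 = generation_time(R1_0528_TOKENS, TOKENS_PER_SECOND)
t_r1t2 = generation_time(R1T2_TOKENS, TOKENS_PER_SECOND)
speedup = t_r1_0528 / t_r1t2

print(f"R1-0528: {t_r1_0528:.0f} s, R1T2: {t_r1t2:.0f} s, speedup: {speedup:.1f}x")
```

	With per-token speed held constant, the time to a finished answer scales with the number of output tokens, so 1/3 of the tokens means roughly 1/3 of the wait.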