by throwa356262 7 days ago
Better performance than TQ and better quality than FP16?

Am I reading this right??

3 comments

It's not better quality: 59.3% vs. 59.4% for FP16 on AIME 25.
A 0.1 percentage point gap is within the margin of error. Depending on the performance boost, a minuscule quality hit might be worth taking.
I think it very much is worth it!

But the point was that quality didn't magically increase.
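
A rough sanity check on the margin-of-error claim above, as a minimal sketch. It assumes a simple binomial model and that AIME 25 means the 30 problems of AIME 2025 I and II, each attempted once; the thread does not say how many samples per problem were actually run, so `n_trials` is a placeholder to adjust.

```python
import math


def binomial_standard_error(accuracy: float, n_trials: int) -> float:
    """Standard error of a pass rate under a simple binomial model."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_trials)


# Scores quoted in the thread.
quantized = 0.593
fp16 = 0.594

# ASSUMPTION: 30 problems, one attempt each. The real evaluation may
# average several samples per problem, which would shrink the error.
n_trials = 30

se = binomial_standard_error(fp16, n_trials)
gap = abs(fp16 - quantized)

print(f"standard error: {se:.4f}")
print(f"observed gap:   {gap:.4f}")
print("gap is smaller than one standard error:", gap < se)
```

Under these assumptions the 0.1 point gap is far smaller than one standard error, so the two scores are statistically indistinguishable in either direction.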

Any divergence from full precision (even if the benchmark score is better) is error.
Just pretend it is the next update step in training. You didn’t train your model to step=inf, I hope?
Faster than FP16, not better quality, I guess.