| HN Mirror



	by ml_hardware 2088 days ago
	Totally agree! I think the 3090 could be a lot more cost-effective for researchers dabbling in NLP. But it really grinds my gears when people post these misleading benchmarks... the 3090 is handicapped to half-rate tensor core performance, while the Titan RTX is not. So if you do your work mainly in FP32, you will see improved performance with the 3090. On the other hand, if you are an FP16 speed demon who needs to train GPT-3 over the weekend, stick with your Titans :)


bitL 2088 days ago

What do you think about TF32 on the 3090? Could it replace FP32 with a 5x speedup?


ml_hardware 2088 days ago

I've done a lot of work in ML numerics, and I think TF32 is a completely safe drop-in for FP32 for ML workloads. NVIDIA seems to think so too, which is why on the A100 it won't be something you opt into: it will be the default mode for any FP32 matrix multiply.
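To make the "drop-in" claim a bit more concrete, here is a rough sketch of what TF32 does to the inputs of a matrix multiply: it keeps FP32's 8 exponent bits but only 10 mantissa bits. The helper `round_to_tf32` and the toy `dot` function are made up for illustration, and the round-to-nearest-even step is an assumption about rounding, not a statement of what the tensor cores actually do in hardware. Accumulation here happens in Python floats, standing in for the FP32 accumulate.

```python
import struct


def round_to_tf32(x: float) -> float:
    """Round x to FP32, then keep only the top 10 mantissa bits (TF32-like)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Leave inf/NaN untouched (exponent field is all ones).
    if (bits & 0x7F800000) == 0x7F800000:
        return struct.unpack("<f", struct.pack("<I", bits))[0]
    # Assumed rounding mode: nearest, ties to even, on the 13 dropped bits.
    bits += 0x0FFF + ((bits >> 13) & 1)
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(xs, ys, cast=lambda v: v):
    """Dot product with an optional cast applied to every input element."""
    return sum(cast(a) * cast(b) for a, b in zip(xs, ys))


xs = [0.1 * (i + 1) for i in range(8)]
ys = [1.0 / (i + 3) for i in range(8)]

exact = dot(xs, ys)
approx = dot(xs, ys, cast=round_to_tf32)
rel_err = abs(approx - exact) / abs(exact)
print(f"relative error with TF32-rounded inputs: {rel_err:.2e}")
```

The exponent range is unchanged from FP32, so nothing overflows or underflows that wouldn't have before; you only give up input precision, roughly 3 decimal digits per operand.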

But on the 3090, I don't think the speedup will be 5x; it should be closer to 2x. The 3090 has 35.6 TF/s at TF32 and the Titan RTX has 16.3 TF/s at FP32. Once again, I think there is handicapping going on for the 3090.
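The "closer to 2x" figure is just the ratio of the two peak numbers quoted above. A quick back-of-the-envelope check, using only those figures (peak throughput, not a measured training speedup; the variable names are made up for this sketch):

```python
# Peak throughput figures as quoted in this comment, in TF/s.
tf32_3090 = 35.6       # RTX 3090, TF32
fp32_titan_rtx = 16.3  # Titan RTX, FP32

speedup = tf32_3090 / fp32_titan_rtx
print(f"peak-throughput ratio: {speedup:.2f}x")  # prints 2.18x
```

Real workloads will land somewhere below that, since not every op is a matrix multiply.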


bitL 2088 days ago

So basically no difference from FP32. That sounds very handicapped.
