| HN Mirror


	by Y_Y 809 days ago
	That's not entirely true. Current-gen Nvidia hardware can use fp8, and the newly announced Blackwell can do fp4. Lots of existing specialized inference hardware uses int8, and some uses int4. You're right that low-precision training still doesn't seem to work, presumably because you lose the smoothness required for SGD-type optimization.
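
To make the "lost smoothness" intuition concrete, here is a minimal sketch in plain Python. It assumes a toy setup where a single weight is stored only as an int8 code with a fixed scale; the helper names (`quantize_int8`, `sgd_step_int8`) and the numbers are hypothetical and not taken from any particular hardware or library. It shows one plausible mechanism, not a full explanation: an SGD update smaller than half a quantization step is rounded away entirely, so the weight moves in jumps or not at all.

```python
def quantize_int8(x, scale):
    """Map a real value to the nearest int8 code, clamped to [-127, 127]."""
    q = round(x / scale)
    return max(-127, min(127, q))


def sgd_step_int8(q, grad, lr, scale):
    """One SGD step on a weight that is stored only as an int8 code.

    Hypothetical toy setup: dequantize, apply the update in full
    precision, then round back to the int8 grid.
    """
    w = q * scale
    return quantize_int8(w - lr * grad, scale)


# Assumption: weights lie in [-1, 1], so one int8 step is about 0.0079.
scale = 1.0 / 127

q0 = quantize_int8(0.25, scale)

# Update of 0.001 is well below half a step (about 0.0039): it vanishes.
q_small = sgd_step_int8(q0, grad=0.1, lr=0.01, scale=scale)

# Update of 0.01 is larger than a step: the code moves by one whole step.
q_large = sgd_step_int8(q0, grad=1.0, lr=0.01, scale=scale)

print(q0, q_small, q_large)
```

In this toy case the small update leaves the stored code unchanged while the large one shifts it by a full step. Inference does not hit this problem, since the weights are fixed and only need to be represented, not nudged.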