| HN Mirror


	by ogrisel 3562 days ago
	Google never stated that they use those to train models, as far as I know. It seems that they are primarily used to save energy when deploying trained models at scale.

Houshalter 3562 days ago

There's no reason they couldn't use them to train, as long as they can account for the lower-precision operations. I think it would be much cheaper to train on them, at that scale anyway.

dharma1 3562 days ago

AFAIK the Google TPU does inference only, at 8 bits. I don't think it's possible to train a neural network at 8-bit precision at this point in time. FP16 works for training though, and is twice as fast as FP32 on certain NVIDIA chips.
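
For anyone wondering what "inference at 8 bits" can look like in practice, here is a rough sketch of one common scheme: min/max linear quantization of float weights to 8-bit codes. This is an assumption for illustration only, not a description of what the TPU actually does; the names `quantize` and `dequantize` and the sample weights are made up.

```python
def quantize(values, num_bits=8):
    """Map floats onto integer codes in [0, 2**num_bits - 1] (illustrative scheme)."""
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Width of one quantization step; fall back to 1.0 if all values are equal.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical weights, chosen only to exercise the functions.
weights = [-1.5, -0.2, 0.0, 0.7, 2.3]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)

# Worst-case reconstruction error is about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, max_error)
```

The point is that a trained network often tolerates this kind of error at inference time, which is a separate question from whether the gradient updates during training survive it.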

Houshalter 3562 days ago

Backpropagation can work with any precision, as long as you use stochastic rounding (so that the rounding errors are not correlated). Without stochastic rounding, even 16 bits will have rounding error bias.

http://arxiv.org/abs/1412.7024
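
A minimal sketch of the stochastic rounding idea, in plain Python rather than any framework's API: round up with probability equal to the fractional remainder, so the rounded value is correct in expectation. The grid size (`step = 1 / 256`), the update size, and the function names are assumptions picked for illustration, not taken from the paper.

```python
import math
import random


def round_nearest(x, step):
    """Deterministic round-to-nearest onto a grid of spacing `step`."""
    return math.floor(x / step + 0.5) * step


def stochastic_round(x, step, rng):
    """Round onto the grid, going up with probability equal to the fractional part."""
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower
    # Rounding up with probability `frac` makes the expected result equal x.
    return (lower + (1 if rng.random() < frac else 0)) * step


def accumulate(update, n, rounder):
    """Apply the same small update n times, rounding the weight after each step."""
    w = 0.0
    for _ in range(n):
        w = rounder(w + update)
    return w


rng = random.Random(0)   # fixed seed so the run is repeatable
step = 1 / 256           # hypothetical fixed-point grid (8 fractional bits)
update = 0.1 * step      # an update much smaller than one grid step
n = 10000

exact = update * n
nearest = accumulate(update, n, lambda x: round_nearest(x, step))
stochastic = accumulate(update, n, lambda x: stochastic_round(x, step, rng))

print(exact, nearest, stochastic)
```

With round-to-nearest every tiny update is rounded away and the weight never leaves zero, which is the bias in question. With stochastic rounding the errors average out and the accumulated weight lands near the exact sum.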

dharma1 3562 days ago

OK. I was going by this - https://petewarden.com/2016/05/03/how-to-quantize-neural-net...

I haven't seen 8-bit training implemented in any (public) frameworks yet, but that's not to say it's not possible. If it works then that's great, especially for specialised hardware.
