| HN Mirror



	by teebs 1303 days ago
	It's likely because GPU calculations are non-deterministic, and small differences in floating-point results could lead to different outcomes (either in the way you described or somewhere deeper in the model).
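To make the floating-point point concrete, here is a minimal sketch in plain Python (no GPU involved). It only illustrates that floating-point addition is not associative, so the same numbers summed in a different order can give a slightly different result. That a parallel reduction on a GPU may vary its summation order between runs is an assumption here, not something this snippet demonstrates.

```python
# Floating-point addition is not associative: grouping changes the rounding.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # round after a + b, then add c
right = a + (b + c)  # round after b + c, then add a

print(left == right)      # False
print(abs(left - right))  # tiny, but not zero
```

A difference this small is usually harmless on its own, but it can be amplified if it flips a comparison (for example an argmax or a sampling threshold) deeper in the model.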

1 comments

> GPU calculations are non-deterministic

TensorFlow is non-deterministic for some operations due to thread scheduling. PyTorch doesn’t have this issue.

Some of the underlying cuDNN algorithms have nondeterministic implementations, and that affects PyTorch as well. See https://pytorch.org/docs/stable/notes/randomness.html
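For anyone who wants to try this, a rough sketch of the kind of setup that page describes is below. The helper name `seed_everything` is hypothetical (not a PyTorch API), and it assumes a reasonably recent PyTorch release that provides `torch.use_deterministic_algorithms`. If PyTorch is not installed it only seeds Python's own RNG.

```python
import os
import random


def seed_everything(seed: int = 0) -> bool:
    """Hypothetical helper: seed RNGs and request deterministic kernels.

    Returns True if PyTorch was found and configured, False otherwise.
    """
    random.seed(seed)
    # cuBLAS needs this set before CUDA is initialized to make some
    # matrix multiplications deterministic.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    try:
        import torch
    except ImportError:
        return False
    torch.manual_seed(seed)
    # Stop cuDNN from auto-tuning, which can pick different algorithms per run.
    torch.backends.cudnn.benchmark = False
    # Ask PyTorch to use deterministic implementations where available and
    # raise an error when an operation has none.
    torch.use_deterministic_algorithms(True)
    return True
```

Call it once at the top of a script, before any CUDA work. Even with these settings, results are not guaranteed to match across different hardware, driver versions, or PyTorch releases, and deterministic kernels may be slower.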