Hacker News
by astrange, 245 days ago
It's partly because floating point math is not associative and GPU inference doesn't guarantee all the steps will be done in the same order.
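A minimal Python sketch of the non-associativity point, assuming IEEE 754 double precision on a CPU. It only imitates a changed summation order and is not GPU code; the helper name `sum_in_order` is made up for illustration.

```python
# Floating-point addition is not associative: grouping changes the result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # rounds after a + b, then again after adding c
right = a + (b + c)   # rounds after b + c, then again after adding a


def sum_in_order(values):
    """Add the values strictly left to right, rounding after every step."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, accumulated in two different orders.
# A parallel reduction that reorders its partial sums could hit either one.
order_1 = sum_in_order([1e16, 1.0, -1e16])  # the 1.0 is lost when added to 1e16
order_2 = sum_in_order([1e16, -1e16, 1.0])  # the big terms cancel first, 1.0 survives

print(left == right)     # False
print(order_1, order_2)  # 0.0 1.0
```

The differences are tiny per operation, but if the order of accumulation varies from run to run, the final values can differ too.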