by shoyer
1973 days ago
As someone who builds neural networks routinely, this sort of non-reproducibility sounds troubling to me. We expect small differences in floating-point arithmetic between platforms, but integer math is typically exact. This is all the more concerning for 8-bit quantized arithmetic, where off-by-one means a relative error of about half a percent. If individual layers in a quantized neural net have off-by-one errors with a consistent bias, I can imagine these errors accumulating into significant losses in model quality in deep networks. There isn't a huge margin for error in quantized neural nets.

One concern about the article: it uses the word "non-deterministic" in a slightly misleading way. I assume any specific hardware is still expected to produce consistent results when run twice on the same input. So it's more non-reproducible than non-deterministic. Compensating for inconsistent arithmetic on different devices sounds much more feasible than compensating for stochastic arithmetic.
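
To make the accumulation worry concrete, here is a toy sketch in Python. It is not taken from the article and does not model any real device: the `requantize` and `run_layers` helpers, the 255/256 scale factor, and the two rounding modes are all assumptions chosen only for illustration. Two hypothetical "devices" run the same stack of layers and differ only in how they round when scaling a wide accumulator back down to 8 bits.

```python
# Toy illustration only: the layer, scale factor, and rounding modes are
# hypothetical and are not taken from the article or from any real hardware.

def requantize(acc: int, shift: int, round_to_nearest: bool) -> int:
    """Scale a wide accumulator down by 2**shift and clamp to unsigned 8 bits."""
    if round_to_nearest:
        acc += 1 << (shift - 1)  # add half an output step before shifting
    return max(0, min(255, acc >> shift))


def run_layers(x: int, depth: int, round_to_nearest: bool) -> int:
    """Apply `depth` identical toy layers: multiply by 255/256, then requantize."""
    for _ in range(depth):
        x = requantize(x * 255, 8, round_to_nearest)
    return x


# One least-significant bit relative to the full 8-bit range (about 0.4%).
one_lsb = 1 / 255

# Two hypothetical devices that differ only in rounding behaviour.
device_a = run_layers(100, 20, round_to_nearest=True)
device_b = run_layers(100, 20, round_to_nearest=False)

print(f"one LSB relative to full scale: {one_lsb:.2%}")
print(f"device A after 20 layers: {device_a}")
print(f"device B after 20 layers: {device_b}")
print(f"device B again:           {run_layers(100, 20, round_to_nearest=False)}")
```

In this sketch each device returns the same value every time it is run on the same input, so neither is non-deterministic, yet the two disagree by one step per layer and end up 20 steps apart after 20 layers. Real networks will not be this pathological, but it shows how a consistently biased off-by-one can grow with depth rather than average out.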