by dbcurtis
3190 days ago
In the context of graphics processing, that trade-off totally makes sense. Thanks for doing the homework that I was too lazy to do :) It seems to me that in the context of NN computations, using the lack of gradual underflow as a non-linear element is going to severely limit the dynamic range of the neurons. On the plus side, the non-linear element is a computational freebie. But in addition to limiting the dynamic range, it makes the NN ridiculously non-portable across hardware implementations.
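To make the "lack of gradual underflow as a non-linear element" point concrete, here is a rough sketch in NumPy. It is not the kernels used in the research; it only emulates flush-to-zero in software, and the names `flush_to_zero` and `scale_down_up` are made up for illustration. The idea: scaling by 2**-24 and then by 2**24 is the identity in exact arithmetic, but if the intermediate value lands in the subnormal range and the hardware flushes it, the result collapses to zero.

```python
import numpy as np

# Smallest positive *normal* float32 value (about 1.18e-38).
# Anything smaller in magnitude (and non-zero) is subnormal.
TINY = np.finfo(np.float32).tiny

def flush_to_zero(x):
    """Emulate flush-to-zero: replace subnormal float32 values with 0.

    Hypothetical helper for illustration; real hardware does this
    inside the arithmetic unit, not as a separate step.
    """
    x = np.asarray(x, dtype=np.float32)
    return np.where(np.abs(x) < TINY, np.float32(0.0), x)

def scale_down_up(x, ftz):
    """Multiply by 2**-24, then by 2**24 (a purely 'linear' round trip)."""
    down = np.float32(x) * np.float32(2.0 ** -24)
    if ftz:
        down = flush_to_zero(down)  # the only non-linear step
    return float(down * np.float32(2.0 ** 24))

if __name__ == "__main__":
    for x in (1.0, 1e-35):
        print(x, scale_down_up(x, ftz=False), scale_down_up(x, ftz=True))
```

With gradual underflow the small input survives the round trip (with some lost precision); with emulated flush-to-zero it becomes exactly zero, while the large input is untouched either way. That dead zone near zero is the "free" nonlinearity, and it is also why the same weights would behave differently on hardware that keeps denormals.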
|
I had to give Jakob custom GEMM kernels to do this research. Not sure why the denormal point was left out of this blog post, as it's pretty critical to the whole experiment.