by AbrahamParangi 778 days ago
When I played around with implementing this last night, I found that using radial basis functions instead of Fourier coefficients (I tried those as well; they are nice and parallel and easy to write) was better behaved when training networks of depth greater than 2.
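For anyone who wants to try the comparison, here is a rough sketch of what I mean by the two choices. It assumes "this" is a layer where each input-to-output edge applies a learnable 1-D function written as a weighted sum of basis functions. The names (`fourier_features`, `rbf_features`, `basis_layer`), the Gaussian form of the RBF, and the centers and width are all my own illustrative picks, not anything from a particular library or from the original implementation.

```python
import math

def fourier_features(x, num_freqs):
    """Return [cos(kx), sin(kx)] for k = 1..num_freqs (global, periodic basis)."""
    feats = []
    for k in range(1, num_freqs + 1):
        feats.append(math.cos(k * x))
        feats.append(math.sin(k * x))
    return feats

def rbf_features(x, centers, width):
    """Return Gaussian bumps exp(-((x - c) / width)^2), one per center (local basis)."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]

def basis_layer(inputs, coeffs, featurize):
    """Compute y_j = sum_i sum_b coeffs[j][i][b] * featurize(x_i)[b].

    inputs:    list of floats, one per input unit
    coeffs:    nested list indexed [output][input][basis]; these are the trainable weights
    featurize: function mapping a float to a list of basis values
    """
    feats = [featurize(x) for x in inputs]
    return [
        sum(c * f for row, fx in zip(unit, feats) for c, f in zip(row, fx))
        for unit in coeffs
    ]

# Swapping the basis only changes the featurize argument (assumed settings).
centers = [-1.0, -0.5, 0.0, 0.5, 1.0]
rbf = lambda x: rbf_features(x, centers, width=0.5)
fourier = lambda x: fourier_features(x, num_freqs=2)
```

Both versions are just an elementwise feature map followed by a weighted sum, which is why either one parallelizes easily. The one structural difference I am sure of is that each Gaussian bump only responds near its center, while every sine and cosine term responds across the whole input range. Whether that explains the training behavior at depth is a guess on my part.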