by apl
2255 days ago
> gradient descent no longer has to be written by hand

Nobody's been writing derivatives by hand for 5+ years. All major frameworks (PyTorch, TensorFlow, MXNet, autodiff, Chainer, Theano, etc.) have decent to great automatic differentiation. The differences and improvements are more subtle: easy parallelization/vectorization, higher-order gradients, good XLA support.
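To make the "more subtle" differences concrete, here is a minimal sketch in JAX of a first derivative, a higher-order derivative, and vectorization over a batch. The function `f` and the input values are made up purely for illustration; only `jax.grad` and `jax.vmap` are real library calls.

```python
import jax
import jax.numpy as jnp

def f(x):
    # Hypothetical scalar function, chosen only for illustration.
    return jnp.sin(x) * x ** 2

df = jax.grad(f)             # first derivative, no hand-written calculus
d2f = jax.grad(jax.grad(f))  # higher-order gradient by composing grad
batched_df = jax.vmap(df)    # vectorize the derivative over a batch axis

xs = jnp.array([0.5, 1.0, 2.0])
print(df(1.0))          # f'(1)  = cos(1) + 2*sin(1)
print(d2f(1.0))         # f''(1) = 4*cos(1) + sin(1)
print(batched_df(xs))   # f' evaluated at each element of xs
```

The transformations compose as ordinary functions, which is roughly what "composability" means in practice here.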
Automatic differentiation allows for great flexibility and composability, but the performance is still far from good, even with the various JITs available. For now, though, JAX seems to be one of the most flexible and best-optimized options for many use cases.
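For reference, combining autodiff with a JIT looks roughly like this in JAX. This is only a sketch: the linear model, the toy data, and the `LEARNING_RATE` constant are assumptions made up for the example, and it says nothing about how fast the compiled step actually is.

```python
import jax
import jax.numpy as jnp

LEARNING_RATE = 0.1  # hypothetical value, picked for this toy problem

def loss(w, x, y):
    # Mean squared error of a linear model (illustrative only).
    return jnp.mean((x @ w - y) ** 2)

@jax.jit
def step(w, x, y):
    # One gradient descent update; the gradient comes from jax.grad
    # and the whole update is compiled via XLA by jax.jit.
    return w - LEARNING_RATE * jax.grad(loss)(w, x, y)

# Toy data generated from the weights [2, -1].
x = jnp.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = jnp.array([2.0, -1.0, 1.0])

w = jnp.zeros(2)
for _ in range(200):
    w = step(w, x, y)
print(w)  # should end up close to the generating weights
```

The first call to `step` pays the tracing and compilation cost; later calls with the same shapes reuse the compiled version.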
[1]: https://github.com/salesforce/pytorch-qrnn