Hacker News new | ask | show | jobs
by SoerenL 2776 days ago
That's what is usually done, autodiff on the component wise expression. We don't do it here. Instead, we really compute on the matrix and tensor level and compute derivatives here directly. Let me give you a simple example to illustrate it: Consider the function f(x)=x'Ax. Then, its Hessian is A+A'. This expression is what we compute. And evaluating this expression is orders of magnitude faster than what TF, PyTorch, etc. do, namely autodiff on the components. Hope this helps, if not, please let me know.
1 comments

Do you think this could end up being implemented in TF and Pytorch?
Not in its current formulation. It uses a different representation of the tensors. However, a new version/algorithm that will be available in a few months can be used in TF and PyTorch.
Are you referring to the Ricci notation when you are saying it uses a different representation of the tensors? Do you also plan to add non-differentiable functions like relu?
Yeah, I am referring to Ricci notation. TF and PyTorch don't use it. The new version already has relus. This version is targeted at standard formulas (has abs as a non-differentiable function), next version works for deep learning. Never wanted to work on this but there is just no way around it.