by jules 4622 days ago
Read the article. It's not about computing derivatives of real-world data (using finite differences or whatever); it's about exact derivatives of rational functions specified by a computer program. While what you wrote is interesting, none of it applies to this article.

^ Totally this. It's not meant for differentiating a function known only by its points. It's meant for evaluating nth-order derivatives of functions you can already evaluate at arbitrary values. The functions don't even have to be rational. As long as the function can be calculated by a computer by some method, so can its derivatives.
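
To make that concrete, here is a minimal sketch of forward-mode AD using dual numbers. This is one common way to do it, not necessarily the article's method. The names Dual, sin, derivative and f are made up for illustration, and it only handles first derivatives, addition, multiplication and sine.

```python
import math

class Dual:
    """A number val + der*eps with eps**2 == 0; der carries the derivative."""
    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants (derivative 0).
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def sin(x):
    # Chain rule for a non-rational primitive.
    x = Dual.lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

def derivative(f, x):
    # Seed the input with derivative 1 and read the derivative off the output.
    return f(Dual(x, 1.0)).der

def f(x):
    # Ordinary program code, loop included: computes x**3 + sin(x).
    y = x
    for _ in range(2):
        y = y * x
    return y + sin(x)

print(derivative(f, 2.0))     # AD result
print(12.0 + math.cos(2.0))   # hand-derived 3*x**2 + cos(x) at x = 2
```

The point is that f is just a program: nothing symbolic happens, and no step size is involved. Higher-order derivatives need more machinery than this sketch has.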

This comes in handy whenever you have designed some fancy kernel function for a process that needs both the function's value and some of its derivatives, backpropagation for example [1].
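
For the backpropagation case, a rough sketch of reverse-mode AD looks something like the following. It is a toy under stated assumptions: Var, tanh and backward are hypothetical names, only + and * are supported, and every operand must already be a Var. Real libraries do far more.

```python
import math

class Var:
    """A value plus links to the values it was computed from."""
    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents  # pairs of (parent Var, local partial derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.val + other.val, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.val * other.val,
                   ((self, other.val), (other, self.val)))

def tanh(x):
    t = math.tanh(x.val)
    return Var(t, ((x, 1.0 - t * t),))

def backward(out):
    # Order the graph so each node comes after everything it depends on.
    order, seen = [], set()
    def visit(v):
        if id(v) in seen:
            return
        seen.add(id(v))
        for parent, _ in v.parents:
            visit(parent)
        order.append(v)
    visit(out)
    # Push gradients from the output back toward the inputs.
    out.grad = 1.0
    for v in reversed(order):
        for parent, local in v.parents:
            parent.grad += local * v.grad

# One "neuron": y = tanh(w*x + b)
w, x, b = Var(0.5), Var(2.0), Var(-1.0)
y = tanh(w * x + b)
backward(y)
print(w.grad, x.grad, b.grad)
```

One backward pass gives the derivative of y with respect to every input, which is why this flavor suits learning algorithms with many parameters.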

As someone in the field once told me, people used to generate machine learning publications simply by picking functions whose closed-form derivatives took fancy mathematical tricks to find, so that those functions would be usable in learning algorithms. In many cases that work is unnecessary if you use automatic differentiation.

It's a really cool concept, applicable in specific situations. If you need to know the derivative of a function that's not fully specified, you need numerical differentiation [2]. If you need a closed-form expression for your derivative function, that's when you need symbolic differentiation [3].
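
For contrast, here is roughly what the numerical route looks like when all you have is sample points. This is a sketch assuming equally spaced samples; central_differences is a made-up helper, and the sine samples just stand in for measured data.

```python
import math

def central_differences(xs, ys):
    """Estimate dy/dx at the interior sample points (equal spacing assumed)."""
    return [(ys[i + 1] - ys[i - 1]) / (xs[i + 1] - xs[i - 1])
            for i in range(1, len(xs) - 1)]

# Pretend these samples are all we know about the function.
h = 0.1
xs = [i * h for i in range(6)]
ys = [math.sin(x) for x in xs]

estimates = central_differences(xs, ys)
for x, est in zip(xs[1:-1], estimates):
    # The true derivative is printed only to show the approximation error.
    print(x, est, math.cos(x))
```

Unlike AD, the result is only approximate and the error depends on the sample spacing, but it never needs to call the function itself.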

[1] http://en.wikipedia.org/wiki/Backpropagation
[2] http://en.wikipedia.org/wiki/Numerical_differentiation
[3] http://en.wikipedia.org/wiki/Symbolic_differentiation

> It's not meant for differentiating a function known only by its points. It's meant for evaluating nth-order derivatives of functions you can already evaluate at arbitrary values

This is correct. So my comment, "I think generally a better approach to AD is to redefine your differentiation", doesn't really apply to AD. Essentially, AD is about differentiating functions, not data.