

by bigred100 2418 days ago

This is beyond my reach, but I’ll take a stab at it:

Whatever your computer actually computes is more or less a sequence of additions, subtractions, multiplications, and divisions. The function you want to compute may not be differentiable, but the sequence of arithmetic operations you actually execute to arrive at your output should basically be differentiable, one step at a time, by the chain rule (at least along the branch of the code that actually ran). An AD tool should then let you differentiate that sequence.
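
To make that concrete, here is a rough sketch of forward-mode AD using dual numbers. This is only an illustration of the idea, not how any particular AD tool is implemented; the names `Dual`, `derivative`, and `f` are made up for this example. Each value carries its derivative along with it, and each arithmetic operation applies the matching differentiation rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative w.r.t. one chosen input (hypothetical helper)."""
    val: float
    der: float = 0.0

    @staticmethod
    def lift(x):
        # Plain numbers are constants, so their derivative is 0.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        # Product rule.
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        # Quotient rule.
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def derivative(f, x):
    # Seed the input with derivative 1 and read the derivative off the output.
    return f(Dual(x, 1.0)).der


def f(x):
    # Any function built only from +, -, *, / works unchanged.
    return x * x + 3 * x / (x + 1)


print(derivative(f, 2.0))  # should agree with 2*x + 3/(x + 1)**2 at x = 2
```

Note that `f` itself never mentions derivatives; the chain rule is applied operation by operation as the arithmetic runs.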

It seems likely to me that there’s nothing extraordinarily good about this technique in general, without further discussion of the specific application, given that the field of derivative-free optimization exists and is actively researched. I don’t really know why those researchers would bother if AD (not a new thing) had supplanted their field.

In numerical computing in general, I’ve often seen the idea that you have two options: either differentiate the thing you’re trying to optimize and then use a numerical method to evaluate the derivative, or use a numerical method to evaluate the thing you want to optimize and then differentiate the numerical method. I’m not expert enough to provide any real analysis of when to use each.
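
A toy illustration of those two routes, assuming the "thing" is sqrt(a) and the numerical method is Newton's iteration (both chosen purely for this example, as are the function names). Route A differentiates on paper first, giving 1/(2*sqrt(a)), and then evaluates that numerically. Route B differentiates the iteration itself by carrying a derivative through every step.

```python
def newton_sqrt(a, steps=8):
    # Newton's iteration for sqrt(a): x <- (x + a/x) / 2.
    x = 1.0
    for _ in range(steps):
        x = 0.5 * (x + a / x)
    return x


def route_a(a):
    # Differentiate first (d/da sqrt(a) = 1 / (2*sqrt(a))),
    # then evaluate that formula with the numerical method.
    return 1.0 / (2.0 * newton_sqrt(a))


def route_b(a, steps=8):
    # Run the numerical method and differentiate each step of it w.r.t. a.
    x, dx = 1.0, 0.0  # the initial guess does not depend on a
    for _ in range(steps):
        # d/da of (x + a/x)/2, using the quotient rule on a/x.
        x, dx = 0.5 * (x + a / x), 0.5 * (dx + (x - a * dx) / (x * x))
    return dx


a = 2.0
print(route_a(a), route_b(a))  # the two should agree closely once the iteration converges
```

In this small case the two routes land on essentially the same number. That need not hold in general, for example when the iteration is stopped early, which is part of why the choice between them depends on the problem.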