by shoyer 1840 days ago

The general case of implicit differentiation, i.e., for functions y(x) defined by the constraints F(x, y(x)) = 0, where x and y are vectors, is handled by the implicit function theorem: https://en.wikipedia.org/wiki/Implicit_function_theorem

∂ y(x) = -(∂_1 F(x, y(x)))^{-1} (∂_0 F(x, y(x)))

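where ∂_0 and ∂_1 denote partial differentiation with respect to the first and second arguments of F, and ∂_1 F (the Jacobian with respect to y) is assumed to be invertible. The identity follows from differentiating F(x, y(x)) = 0 with respect to x, which gives ∂_0 F + (∂_1 F) ∂y = 0.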

This turns out to be an incredibly useful identity for calculating derivatives. No matter how you calculated a solution to the equation, computing derivatives is "just" a matter of performing a linear solve.
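A minimal sketch of this in plain Python, using a made-up scalar constraint F(x, y) = y^3 + y - x chosen purely for illustration (the names F, solve_y and dy_dx are hypothetical, not from any library). The solver is bisection, which is awkward to differentiate through, yet the derivative only needs the two partials of F at the solution. In the scalar case the "linear solve" is just a division; for vectors it would be a solve against the Jacobian ∂_1 F.

```python
def F(x, y):
    # Hypothetical constraint for illustration: y**3 + y - x = 0.
    return y**3 + y - x


def solve_y(x, lo=-10.0, hi=10.0, iters=200):
    # Bisection on y. F is strictly increasing in y, so the root is unique.
    # Assumes the root lies in [lo, hi]. The solver is treated as a black box.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(x, mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def dy_dx(x):
    # Implicit differentiation: dy/dx = -(dF/dy)^{-1} (dF/dx) at the solution.
    y = solve_y(x)
    dF_dx = -1.0               # partial of F with respect to its first argument
    dF_dy = 3.0 * y**2 + 1.0   # partial of F with respect to its second argument
    return -dF_dx / dF_dy      # scalar stand-in for the linear solve


x = 2.0
y = solve_y(x)          # the root is y = 1, since 1 + 1 - 2 = 0
implicit = dy_dx(x)     # 1 / (3 * 1**2 + 1) = 0.25

# Cross-check against a central finite difference through the solver.
eps = 1e-6
finite_diff = (solve_y(x + eps) - solve_y(x - eps)) / (2 * eps)

print(y, implicit, finite_diff)
```

Swapping bisection for Newton's method or any other root finder would leave dy_dx unchanged, which is the point: the derivative depends only on F and the solution, not on how the solution was found.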

If the calculation you performed amounts to solving an equation, implicit differentiation is typically much faster, less memory intensive and more accurate than calculating derivatives by differentiating through your solver. For examples, you might check out a recent paper I co-authored with colleagues at Google: https://arxiv.org/abs/2105.15183