by shoyer
1840 days ago
The general case of implicit differentiation, i.e., for functions y(x) defined by the constraint F(x, y(x)) = 0, where x and y are vectors, is solved by the implicit function theorem: https://en.wikipedia.org/wiki/Implicit_function_theorem

∂ y(x) = -(∂_1 F(x, y(x)))^{-1} (∂_0 F(x, y(x)))

where ∂_0 and ∂_1 denote partial differentiation with respect to the first and second arguments of F (so both are Jacobian matrices), and ∂_1 F is assumed to be invertible at the solution.

This turns out to be an incredibly useful identity for calculating derivatives. No matter how you calculated a solution to the equation, computing derivatives is "just" a matter of performing a linear solve.

If the calculation you performed amounts to solving an equation, implicit differentiation is typically much faster, less memory intensive and more accurate than differentiating through your solver.

For examples, you might check out a recent paper I co-authored with colleagues at Google: https://arxiv.org/abs/2105.15183
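To make the identity concrete, here is a minimal NumPy sketch. Everything in it is made up for illustration and is not taken from the paper: the constraint F(x, y) = y + 0.1 sin(M y) - x, the matrix M, and the helper names `solve` and `implicit_jacobian`. The solver is a plain fixed-point iteration, but the derivative only needs the Jacobians of F at the solution plus one linear solve.

```python
import numpy as np

# Hypothetical example problem: F(x, y) = y + 0.1 * sin(M @ y) - x = 0
M = np.array([[1.0, 2.0], [-1.0, 0.5]])

def F(x, y):
    return y + 0.1 * np.sin(M @ y) - x

def solve(x, iters=200):
    # Fixed-point iteration y <- x - 0.1 * sin(M @ y).
    # It converges here because 0.1 * ||M|| < 1; any other solver would do.
    y = np.zeros_like(x)
    for _ in range(iters):
        y = x - 0.1 * np.sin(M @ y)
    return y

def implicit_jacobian(x, y):
    # Jacobians of F with respect to its first (x) and second (y) arguments,
    # written out by hand for this particular F.
    d0F = -np.eye(len(x))
    d1F = np.eye(len(y)) + 0.1 * np.cos(M @ y)[:, None] * M
    # Implicit function theorem: dy/dx = -(d1F)^{-1} d0F, via one linear solve.
    return -np.linalg.solve(d1F, d0F)

x = np.array([0.3, -0.7])
y = solve(x)
J = implicit_jacobian(x, y)

# Sanity check: central finite differences through the solver itself.
eps = 1e-6
J_fd = np.stack(
    [(solve(x + eps * e) - solve(x - eps * e)) / (2 * eps) for e in np.eye(2)],
    axis=1,
)
print(np.max(np.abs(J - J_fd)))  # should be small
```

Note that `implicit_jacobian` never looks at how `solve` works; it only uses the converged y. With an autodiff library the two hand-written Jacobians could be computed automatically instead.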