| > Also, don't forget the Jacobian and gradient aren't the same thing! Every gradient is a Jacobian but not every Jacobian is a gradient. If you have a map f from R^n to R^m then the Jacobian at a point x is an m x n matrix which linearly approximates f at x.
If m = 1 (that is, if f is a scalar function) then the Jacobian is exactly the gradient, written as a 1 x n row. If you already know about gradients (e.g. from physics or ML) and can't quite wrap your head around the Jacobian, the following might help (it's how I first got to understand Jacobians better):
1. Write your function f from R^n to R^m as m scalar functions f_1, ..., f_m, namely f(x) = (f_1(x), ..., f_m(x)).
2. Take the gradient of f_i for each i.
3. Make an m x n matrix where the i-th row is the gradient of f_i.
The matrix you build in step 3 is precisely the Jacobian. This is obvious if you know the definition, and it's not a mathematically remarkable fact, but for me at least it was useful for demystifying the whole thing. |
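If it helps to see the three steps run, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the example map f from R^3 to R^2 (components `f1` and `f2`), the helper names `gradient` and `jacobian`, and the choice to estimate gradients by central differences rather than compute them symbolically. It only aims to show that stacking the component gradients as rows matches a Jacobian worked out by hand.

```python
import math

# Step 1: split f: R^3 -> R^2 into scalar components f_1 and f_2 (illustrative choices).
def f1(x):
    # f_1(x0, x1, x2) = x0 * x1 + x2
    return x[0] * x[1] + x[2]

def f2(x):
    # f_2(x0, x1, x2) = sin(x0) + x1 * x2^2
    return math.sin(x[0]) + x[1] * x[2] ** 2

def gradient(g, x, h=1e-6):
    """Central-difference estimate of the gradient of a scalar function g at x."""
    grad = []
    for j in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[j] += h
        backward[j] -= h
        grad.append((g(forward) - g(backward)) / (2 * h))
    return grad

def jacobian(components, x):
    """Steps 2 and 3: take each component's gradient and use it as a row."""
    return [gradient(f_i, x) for f_i in components]

x = [1.0, 2.0, 3.0]
J = jacobian([f1, f2], x)  # m x n = 2 x 3

# Jacobian of the same f computed by hand, for comparison.
J_exact = [
    [x[1], x[0], 1.0],
    [math.cos(x[0]), x[2] ** 2, 2 * x[1] * x[2]],
]

for row, row_exact in zip(J, J_exact):
    print([round(v, 4) for v in row], [round(v, 4) for v in row_exact])
```

Each printed line shows a row of the stacked-gradient matrix next to the matching row of the hand-computed Jacobian. The two should agree up to finite-difference error.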