by C-x_C-f 454 days ago
> Also, don't forget the Jacobian and gradient aren't the same thing!

Every gradient is a Jacobian but not every Jacobian is a gradient.

If you have a map f from R^n to R^m then the Jacobian at a point x is an m x n matrix which linearly approximates f at x. If m = 1 (namely if f is a scalar function) then the Jacobian is exactly the gradient.
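
To make "linearly approximates" concrete, here is a small sketch in plain Python. The example map f from R^2 to R^2, its hand-derived Jacobian, and the helper names are my own assumptions for illustration. It compares f(x + h) with f(x) + J h and checks that the gap shrinks quadratically as h shrinks.

```python
import math

def f(x):
    """Example map from R^2 to R^2 (an assumption, chosen only for illustration)."""
    return [x[0] ** 2 * x[1], math.exp(x[0]) + x[1]]

def jacobian(x):
    """Hand-derived 2 x 2 Jacobian of the example f at the point x."""
    return [[2 * x[0] * x[1], x[0] ** 2],
            [math.exp(x[0]), 1.0]]

def linear_approx(x, h):
    """f(x) + J(x) h, the first-order approximation of f(x + h)."""
    fx, J = f(x), jacobian(x)
    return [fx[i] + sum(J[i][j] * h[j] for j in range(len(h)))
            for i in range(len(fx))]

def approx_error(x, h):
    """Largest componentwise gap between f(x + h) and its linear approximation."""
    exact = f([xi + hi for xi, hi in zip(x, h)])
    approx = linear_approx(x, h)
    return max(abs(e - a) for e, a in zip(exact, approx))

x = [1.0, 2.0]
err_big = approx_error(x, [1e-2, 1e-2])
err_small = approx_error(x, [1e-3, 1e-3])
# Shrinking h by 10x should shrink the error by roughly 100x,
# because the remainder is second order in h.
print(err_big, err_small)
```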

If you already know about gradients (e.g. from physics or ML) and can't quite wrap your head around the Jacobian, the following might help (it's how I first got to understand Jacobians better):

1. write your function f from R^n to R^m as m scalar functions f_1, ..., f_m, namely f(x) = (f_1(x), ..., f_m(x))

2. take the gradient of f_i for each i

3. make an m x n matrix where the i-th row is the gradient of f_i

The matrix you build in step 3 is precisely the Jacobian. This is obvious if you know the definition, and it's not a mathematically remarkable fact, but for me at least it was useful for demystifying the whole thing.
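
The three steps above can be sketched in plain Python. The example map from R^2 to R^3, the helper names (gradient, jacobian_by_rows), and the central-difference step size are all assumptions made for illustration. A real project would more likely use an autodiff library.

```python
import math

def f(x):
    """Example map from R^2 to R^3 (an assumption, chosen only for illustration)."""
    x0, x1 = x
    return [x0 * x1, math.sin(x0), x0 + x1 ** 2]

def gradient(g, x, h=1e-6):
    """Central-difference gradient of a scalar function g at the point x."""
    grad = []
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((g(xp) - g(xm)) / (2 * h))
    return grad

def jacobian_by_rows(f, x, m):
    """Split f into m scalar functions, take each gradient, stack them as rows."""
    components = [lambda y, i=i: f(y)[i] for i in range(m)]  # step 1
    return [gradient(f_i, x) for f_i in components]          # steps 2 and 3

x = [1.0, 2.0]
J = jacobian_by_rows(f, x, m=3)

# Hand-derived Jacobian of the example map, for comparison.
J_exact = [[x[1], x[0]],
           [math.cos(x[0]), 0.0],
           [1.0, 2 * x[1]]]
```

J comes out as a 3 x 2 list of rows, one gradient per component, and agrees with J_exact up to finite-difference error.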


For m = 1, the gradient is a "vector" (a column vector). The Jacobian is a functional, i.e. a linear map (a row vector, dual to a column vector). They're transposes of one another. For m > 1, I would normally just define the Jacobian as a linear map in the usual way and define the gradient to be its transpose. Remember that these are all just definitions at the end of the day and a little bit arbitrary.

I'd say a gradient is usually a covector / one-form. It's a map from vector directions to a scalar change, i.e. df = f_x dx + f_y dy is what you can actually compute without a metric; it's in T*M, not TM. If you have a direction vector (e.g. 2 d/dx), you can feed it to df and get a scalar.

I'm not a big Riemannian geometry buff, but I took a look at the definition in Do Carmo's book and it appears that "grad f" actually lies in TM, consistent with what I said above. Would love to learn more if I've got this mixed up.

This would be nice, because it would generalize the "gradient" from vector calculus, which is clearly and unambiguously a vector.

It's probably just a notation/definition issue. I'm not sure "grad f" is 100% consistently defined.

I'm a simple-minded physicist. I just know if you apply the same coordinate transformation to the gradient and to the displacement vector, you get the wrong answer.

My usual reference is Schutz's Geometrical Methods of Mathematical Physics, and he defines the gradient as df, but other sources call that the "differential" and say the gradient is what you get if you use the metric to raise the indices of df.

But that raised-index gradient (i.e. g^{-1}(df), the inverse metric applied to df) is weird and non-physical. It doesn't behave properly under coordinate transformations. So I'm not sure why folks use that definition.

You can see the difference by looking at the differential in polar coordinates. If you have f=x+y, then df=dx+dy=(cos th + sin th)dr + r(cos th - sin th)d th. If you instead pretend the cartesian components (1,1) of df belong to a vector and transform them the way a vector transforms, you'd get "df"=(cos th + sin th)dr + (1/r)(cos th - sin th)d th, which just gives the wrong answer.

To be specific, if v=(1,1) in cartesian (ex,ey), then df(v)=2. Take a point on the th=0 axis: there (1,1) in cartesian is (1,1/r) in polar (er, etheta). The "proper" df still gives 1 + r(1/r) = 2, but the "weird metric one" gives 1+1/r^2, since you get the 1/r factor twice, instead of a 1/r and a balancing r.
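
Here is a quick numerical check of that polar example, as a sketch in plain Python. The function names are made up for illustration, and I'm assuming the polar components are taken in the coordinate basis (d/dr, d/dtheta), which is what makes (1,1) become (1,1/r) at th=0.

```python
import math

def polar_components(v_cart, r, th):
    """Components of a cartesian vector in the polar coordinate basis (d/dr, d/dtheta)."""
    vx, vy = v_cart
    v_r = math.cos(th) * vx + math.sin(th) * vy
    v_th = (-math.sin(th) * vx + math.cos(th) * vy) / r
    return v_r, v_th

def df_polar(r, th):
    """Components of df for f = x + y in the basis (dr, dtheta)."""
    return math.cos(th) + math.sin(th), r * (math.cos(th) - math.sin(th))

def wrong_df_polar(r, th):
    """The cartesian components (1, 1) of df, transformed as if they were a vector."""
    return polar_components((1.0, 1.0), r, th)

def pair(covector, vector):
    """Plain sum of component products, with no metric involved."""
    return sum(c * v for c, v in zip(covector, vector))

r, th = 2.0, 0.0
v = polar_components((1.0, 1.0), r, th)   # (1, 1/r) at th = 0
proper = pair(df_polar(r, th), v)         # agrees with the cartesian df(v) = 2
wrong = pair(wrong_df_polar(r, th), v)    # picks up 1/r twice: 1 + 1/r^2
print(proper, wrong)
```

The proper pairing stays at 2 for any r and th, while the other one depends on where you are.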

And I'm just a simple applied mathematician. For me, the gradient is the vector that points in the direction of steepest increase of a scalar field, and the Jacobian (or indeed, "differential") is the linear map in the Taylor expansion. I'll be curious to take a look at your reference: looks like a good one, and I'm definitely interested in seeing what the physicist's perspective is. Thanks!
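
For what it's worth, the "steepest increase" picture is easy to check numerically. This is only a sketch: the scalar field, the point, and the angular grid are assumptions for illustration, and "unit direction" here means unit length in the ordinary Euclidean sense, so a metric is being assumed.

```python
import math

def f(p):
    """Example scalar field on R^2 (an assumption, chosen only for illustration)."""
    x, y = p
    return x ** 2 + 3 * y

def grad_f(p):
    """Hand-derived gradient of the example field."""
    return [2 * p[0], 3.0]

def directional_derivative(p, u, h=1e-6):
    """Central-difference rate of change of f at p along the unit vector u."""
    fwd = f([p[0] + h * u[0], p[1] + h * u[1]])
    bwd = f([p[0] - h * u[0], p[1] - h * u[1]])
    return (fwd - bwd) / (2 * h)

p = [1.0, 1.0]
n_angles = 3600
angles = [2 * math.pi * k / n_angles for k in range(n_angles)]

# Direction (as an angle) in which f increases fastest, found by brute force.
best = max(angles,
           key=lambda a: directional_derivative(p, [math.cos(a), math.sin(a)]))

# Angle of the gradient vector, for comparison.
g = grad_f(p)
grad_angle = math.atan2(g[1], g[0]) % (2 * math.pi)
print(best, grad_angle)
```

The brute-force direction lands within one grid step of the gradient's direction, and the rate of change along it is close to the gradient's length.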