by mcabbott 1615 days ago
Thanks for the references. We're trying to write up a paper about all this work, and (to us) it seems obvious that differential geometry is the right framework.

What reverse-mode AD informally calls "gradients" are cotangents. If you are constrained to a submanifold of another manifold (such as the space of all symmetric matrices, embedded in the space of all NxN matrices), then your cotangent is the projection of the cotangent in the larger space. This reduces to the obvious thing for symmetric matrices, as you say, but there are less obvious AD-relevant cases (like the space of orthogonal matrices, or the space of "ranges", vectors with uniformly spaced elements) where getting it right with your bare hands is tricky.
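
To make the symmetric case concrete, here is a rough sketch in plain Python (not code from any AD library; the names `project_symmetric` and `frobenius_inner` are made up for illustration). It assumes the embedding space of NxN matrices carries the usual Frobenius inner product, in which case the projection onto symmetric matrices is just (G + G')/2.

```python
def project_symmetric(g):
    """Orthogonal projection of a square matrix onto the symmetric matrices.

    Assumption: the embedding space uses the Frobenius inner product,
    so the projection is (G + G^T) / 2.
    """
    n = len(g)
    return [[(g[i][j] + g[j][i]) / 2 for j in range(n)] for i in range(n)]


def frobenius_inner(a, b):
    """Frobenius inner product: sum of elementwise products."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# "Naive" gradient of f(X) = X[0][1], treating X as an unconstrained 2x2 matrix.
g = [[0.0, 1.0],
     [0.0, 0.0]]

# Projected cotangent: the off-diagonal weight is shared between both entries.
p = project_symmetric(g)

# Any symmetric perturbation sees the same pairing with g and with p.
s = [[2.0, 3.0],
     [3.0, 5.0]]
```

The point of the last few lines: `g` and `p` pair identically with every symmetric direction `s`, so they represent the same cotangent on the submanifold, but only `p` actually lives in the space of symmetric matrices. Projecting twice changes nothing.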

I don't see any of the relevant terminology in these links, but they do seem to be thinking about related problems. Perhaps they have re-invented the wheel? Amused to read that standard cookbooks seem to have faithfully reproduced the recipe for a square wheel, for decades.