Automatic differentiation is something that you get for free if you use dual numbers. Lie theory describes the relationship between the discrete and continuous spaces. Probability has this deep connection to Lie groups.
To give you some intuition (and I'm really rephrasing the stackexchange post above), the only way you can measure randomness (or generate randomness) is if each draw has a "reference" to some global object that has the global, normalized view of the discrete space.
Are you familiar with Vovk's foundations of probability? That brings you from probability to game semantics. Duality is right next to it.
> Automatic differentiation is something that you get for free if you use dual numbers. Lie theory describes the relationship between the discrete and continuous spaces. Probability has this deep connection to Lie groups.
This...doesn't follow. Probability has a connection to Lie groups because it's fundamentally analytic ("continuous"). But you haven't explained how you make the connection to the dual numbers.
What you're showing here is that a lot of things in mathematics can be described analytically (and saying that would be likewise pretty superfluous). But just because you're working with continuous spaces doesn't mean you've engaged the duals. It generally means you're using the reals.
This gets to the heart of what I'm saying - if I wanted to be flippant I could have said the real numbers, or continuity, or analysis, etc are at the heart of so many distinct subfields of mathematics. It doesn't mean quite a lot.
Duality features in a lot of different parts of mathematics, but that doesn't mean you can productively draw connections between dual things in one area and dual things in another. I'm not seeing how you get from dual numbers to Lie groups.
> Probability has a connection to Lie groups because it's fundamentally analytic ("continuous"). But you haven't explained how you make the connection to the dual numbers.
Are you familiar with Chu spaces?
> But just because you're working with continuous spaces doesn't mean you've engaged the duals. It generally means you're using the reals.
They are not just continuous spaces; it's a pair of a discrete space and a continuous space that are directly connected. You never work with only one of them at a time. You manipulate things in the smooth space to solve things in the discrete space and vice versa.
> This gets to the heart of what I'm saying - if I wanted to be flippant I could have said the real numbers, or continuity, or analysis, etc are at the heart of so many distinct subfields of mathematics. It doesn't mean quite a lot.
Analysis is too general and also conceptually much higher-level. You also need to look at constructive mathematics to really capture duality. And analysis is unusable for a lot of problems that duality is useful for.
For example the Rust borrow checker is based on linear logic, a logic that reifies the concept of duality. No one has ever used analysis to build a compiler.
Forward AD is the pushforward of a tangent vector (an element of the tangent space); reverse AD is the pullback of a cotangent vector (an element of the cotangent space). The duality notion between tangent and cotangent spaces is the same as the duality notion of spaces in optimization. Unfortunately, I'm only passingly familiar with discrete optimization, but I would suspect the notion extends from optimization. That's not to say that they are fundamentally the same or that writing this down helps anybody in any way, but a lot of these "dual" notions do have some sort of dual vector space under the hood.
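To make the pushforward/pullback distinction concrete, here is a minimal sketch in plain Python. The toy function f(x, y) = (x*y, sin(x)), its hand-written Jacobian, and the helper names `jvp`, `vjp`, and `dot` are all assumptions made up for illustration; a real AD system would never build the Jacobian explicitly. The point is only that forward mode computes J·v, reverse mode computes Jᵀ·w, and the two agree under the pairing ⟨w, J·v⟩ = ⟨Jᵀ·w, v⟩.

```python
import math


def jacobian(x, y):
    # Hand-written Jacobian of the toy function f(x, y) = (x * y, sin(x)).
    return [[y, x],
            [math.cos(x), 0.0]]


def jvp(J, v):
    # Forward mode: push a tangent vector v through J (computes J @ v).
    return [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]


def vjp(J, w):
    # Reverse mode: pull a cotangent vector w back through J (computes J^T @ w).
    return [sum(w[i] * J[i][j] for i in range(len(w))) for j in range(len(J[0]))]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


J = jacobian(1.5, -2.0)
v = [0.3, 0.7]    # tangent vector at the input
w = [1.1, -0.4]   # cotangent vector at the output

lhs = dot(w, jvp(J, v))   # <w, J v>
rhs = dot(vjp(J, w), v)   # <J^T w, v>
print(lhs, rhs)
```

The two printed numbers match up to rounding, which is the linear-algebra sense in which forward and reverse mode are dual.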
Yeah, but all you're really describing here is linear algebra. Vector spaces and linearity are a significant part of every single discipline the grandparent commenter mentioned, but they picked out duality.
I would agree with the critique: I don't think highlighting duality here is particularly useful. For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces. It's just a very general semantic concept that describes pairs of things. But it doesn't say that any given pair of dual things is related to another pair of dual things.
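For anyone who hasn't seen the dual-number trick being discussed, here is a minimal sketch in Python. The `Dual` class, `_lift`, `derivative`, and `poly` are hypothetical names for illustration, and only addition and multiplication are supported. Note that nothing in it refers to a dual vector space: it is just arithmetic on pairs with the rule eps**2 == 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    re: float   # ordinary value
    eps: float  # coefficient of eps, where eps**2 == 0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b*eps)(c + d*eps) = a*c + (a*d + b*c)*eps, since eps**2 == 0
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__


def _lift(x):
    # Treat a plain number as a dual number with zero eps part.
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def derivative(f, x):
    # Seed the eps part with 1 and read the derivative off the result.
    return f(Dual(x, 1.0)).eps


def poly(x):
    return 3 * x * x + 2 * x + 1


print(derivative(poly, 2.0))  # d/dx (3x^2 + 2x + 1) at x = 2 is 6*2 + 2 = 14
```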
> For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces.
They don't, because certain operations, such as optimization, are hard to reason about in linear spaces.
Don't get me wrong, I'm not shitting on vector spaces. All I'm saying is that some problems that are hard in vector spaces are easy in the smooth spaces, and vice versa. Having these two APIs to the same space is much more powerful because, again, you generalize over the conversions between the two spaces. You use whichever API is more appropriate in the particular context.
In some sense the linear spaces deal with things like infinity, the smooth spaces deal with cyclical things (signals, wavelets, modular arithmetic).
Quite a bit of optimization is easy to reason about in linear algebra. Take linear and mixed integer programming, for example. And convex optimization subsumes linear optimization in general. There is a lot of nonlinear optimization, but I can assure you with extremely high confidence that the common thread you're seeing here isn't duality, but more abstractly linearity.
Likewise cyclic things show up all the time in purely algebraic (read: discrete, non-smooth) contexts. We have that in vector spaces, group theory, rings, modules, etc.
> For example, the way dual numbers are used to extend the reals for automatic differentiation doesn't have a deep connection to duality in vector spaces.
Yes, the right way to think about dual numbers (especially once you generalize them beyond just the single e^2=0) is to think of them as tangent vectors (sections of the tangent bundle). I've never really liked the "dual number" terminology here. That's why I deliberately chose to use the duality of forward and reverse mode AD, because that notion of duality agrees with the underlying linear algebra (or, in general, differential geometry). I do agree it's a mess of terminology.
Gimme five and I'll answer two. There's quite a few pairwise permutations and some are easier to understand and more instructive than others.
Fundamentally, they are both connected via the idea of convex optimization. Automatic differentiation is a computational technique to solve optimization problems.
Yes, optimization is a very general topic; however, calculus is a fundamental tool for it. Dual numbers are somewhat like Lie groups: very smooth and conducive to optimization.
Curious, can you expand on the connection to convex optimization? To my understanding, discrete optimization is nonconvex by nature due to discontinuities in the feasible space.
There are two types of spaces, discrete and continuous. These are in a dual relationship.
Duality is the isomorphism between these two. For example, for humans, it's easier to reason about discrete spaces. However a lot of things simply cannot be done that way.
Think of anything that is tangential (pun intended) to Lie theory. In the context of Lie theory, you have the discrete group and the continuous algebra (the group's tangent space). You go between these two using the exponential map (algebra -> group) and the logarithm to go back (group -> algebra).
It's the difference between an integral and a Riemann sum. It's the fundamental idea that underlies sampling (say audio sampling or even statistical sampling). You capture some invariants and then you interpolate between these invariants to recreate some smooth curve (or distribution).
The nice thing about the smooth space is that optimization is easy. In the exponential space, addition is multiplication and some expensive things are cheap (computationally speaking).
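As a small illustration of the exp/log maps and the "addition becomes multiplication" remark, here is a sketch using plane rotations, SO(2), as the example group. The choice of SO(2) and the helper names `exp_so2`, `log_so2`, and `matmul` are assumptions for illustration; the sketch follows the standard convention that exp maps the algebra (here, an angle) to the group (a rotation matrix) and log maps back.

```python
import math


def exp_so2(theta):
    # Exponential map: algebra element (an angle) -> group element (a rotation).
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s, c]]


def log_so2(R):
    # Logarithm map: rotation matrix -> angle in (-pi, pi].
    return math.atan2(R[1][0], R[0][0])


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


a, b = 0.3, 0.4

# Adding in the algebra corresponds to multiplying in the group.
# This holds here because SO(2) is commutative; it fails in general.
product = matmul(exp_so2(a), exp_so2(b))
direct = exp_so2(a + b)

print(log_so2(product), a + b)
```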
Unless I'm severely misunderstanding you, a discrete set (or function) cannot be a dual of a continuous set (or function). If nothing else, the former is countable and the latter is uncountable; there can be no isomorphism between the two.
I’m not sure. I’m not entirely convinced that discrete and continuous spaces are dual spaces. They are connected, but they are not duals.
Same with sampling vs continuous. One cannot interchange the order of the composing morphisms while preserving the properties of the original. The sampled object cannot reconstruct the continuous object in all situations due to effects like aliasing.
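A quick sketch of the aliasing point, assuming an 8 Hz sample rate chosen purely for illustration (the names `SAMPLE_RATE` and `sample` are hypothetical): a 1 Hz sine and a 9 Hz sine produce the same samples, so the samples alone cannot tell you which continuous signal they came from.

```python
import math

SAMPLE_RATE = 8  # samples per second (assumed for this example)


def sample(freq_hz, n_samples=8):
    # Sample sin(2*pi*f*t) at t = n / SAMPLE_RATE.
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]


low = sample(1)   # 1 Hz, below the Nyquist frequency of 4 Hz
high = sample(9)  # 9 Hz = 1 Hz + SAMPLE_RATE, so it aliases onto 1 Hz

max_diff = max(abs(p - q) for p, q in zip(low, high))
print(max_diff)  # differs from zero only by floating-point rounding
```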
In optimization, the concept of duality is also a much stronger idea: the primal and the dual of a problem are opposing views of the same problem that correspond exactly (not approximately) in their dual properties.
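To show what "correspond exactly" means here, below is a sketch with a tiny linear program and its dual. The specific numbers and the brute-force vertex-enumeration helper `solve_2d_lp` are made up for illustration (it only handles two variables and assumes the optimum is attained at a vertex); it is not how a real solver works. The primal maximum and the dual minimum come out equal.

```python
from itertools import combinations


def solve_2d_lp(objective, constraints, maximize=True):
    """Brute-force a 2-variable LP by checking every vertex.

    constraints: list of (a1, a2, b), each meaning a1*x + a2*y <= b.
    Assumes the optimum exists and is attained at a vertex.
    """
    best = None
    for (a1, a2, b1), (c1, c2, b2) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < 1e-12:
            continue  # parallel lines, no intersection point
        # Cramer's rule for the intersection of the two boundary lines.
        x = (b1 * c2 - a2 * b2) / det
        y = (a1 * b2 - b1 * c1) / det
        if all(p * x + q * y <= r + 1e-9 for p, q, r in constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or (value > best if maximize else value < best):
                best = value
    return best


# Primal: maximize 3*x1 + 2*x2
#         subject to x1 + x2 <= 4, x1 + 3*x2 <= 6, x1 >= 0, x2 >= 0
primal_opt = solve_2d_lp(
    (3, 2),
    [(1, 1, 4), (1, 3, 6), (-1, 0, 0), (0, -1, 0)],
    maximize=True,
)

# Dual: minimize 4*y1 + 6*y2
#       subject to y1 + y2 >= 3, y1 + 3*y2 >= 2, y1 >= 0, y2 >= 0
# (each ">=" row is negated to fit the "<=" form above)
dual_opt = solve_2d_lp(
    (4, 6),
    [(-1, -1, -3), (-1, -3, -2), (-1, 0, 0), (0, -1, 0)],
    maximize=False,
)

print(primal_opt, dual_opt)
```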
Discrete optimization is nonconvex by nature (does not satisfy convexity definitions) so I’m not sure if it has any duality relations to convex optimization. There is a relationship but it is not a dual relationship.
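The convexity failure is easy to check directly. In this sketch (the helper names are hypothetical), a set is convex only if the midpoint of any two of its points is also in the set, and two neighbouring integer points already break that.

```python
def is_integer_point(p):
    # True if every coordinate of p is a whole number.
    return all(float(c).is_integer() for c in p)


def midpoint(p, q):
    return tuple((a + b) / 2 for a, b in zip(p, q))


a = (0, 0)
b = (1, 0)
mid = midpoint(a, b)

# Both endpoints are feasible integer points, but their midpoint is not,
# so the integer lattice does not satisfy the definition of a convex set.
print(is_integer_point(a), is_integer_point(b), is_integer_point(mid))
```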