|
One thing that many mathematicians today don’t think about is how deeply intertwined the field has historically been with theology. This goes back to the Pythagoreans at least. That survives in the culture of mathematics, where we continue to see a high regard for truth, beauty, and goodness, which, incidentally, are directly related to logic, aesthetics, and ethics. The value of truth in a proof is most obvious. The value of aesthetics is harder to explain, but there's no denying that it is in fact observably valued by mathematicians. As for ethics, remember that human morality is a proper subset thereof. Ethics concerns itself with what is good. It may feel like a stretch, but it's perfectly reasonable to say that for two equally true proofs of the same thing, the one that is more beautiful is also more good. Also, obviously, given two equally beautiful proofs, if only one is true then it is also more good. |
As a non-mathematician, I've noticed this as well, and I have a suspicion the historical "culture" is holding the field back. Gödel proved there are infinitely many true arithmetic statements unprovable within any (consistent, effectively axiomatized, sufficiently powerful) formal system. But our "gold standard" formal system, ZFC, has about as many axioms as we have fingers (counting each axiom schema once). Why is finding more axioms not the absolute highest priority of the field?
We struggle to prove facts about Turing machines with only six states, and it's not obvious to me that ZFC is even capable of resolving all questions about the behavior of six-state Turing machines (well, specifically just ZF, as the axiom of choice has no bearing on arithmetic questions like these).
Yet Turing machines are about as far from abstract mathematics as one can get, because you can actually build these things in our physical universe and observe their behavior over time (except for the whole "infinite tape" part). If we can't predict the behavior of the majority of tiny, deterministic systems with ZFC, what does that say about our ability to understand and predict real-world data, particularly considering that this data likely has an underlying algorithmic structure vastly more complex than that of a six-state Turing machine?
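To make "build one and watch it run" concrete, here is a minimal sketch of a Turing machine simulator in Python. The transition table is the well-known 2-state, 2-symbol busy beaver champion; the names `BB2` and `run`, and the `max_steps` cutoff standing in for the infinite future, are my own choices for illustration.

```python
from collections import defaultdict

# Transition table for the 2-state, 2-symbol busy beaver champion.
# Key: (state, symbol read); value: (symbol to write, head move, next state).
# "H" is the halting state. The names here are illustrative assumptions.
BB2 = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "H"),
}

def run(table, max_steps):
    """Simulate from a blank tape; return (halted, steps, ones_on_tape)."""
    tape = defaultdict(int)  # unbounded tape, blank cells read as 0
    head, state, steps = 0, "A", 0
    while state != "H" and steps < max_steps:
        write, move, state = table[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    return state == "H", steps, sum(tape.values())

print(run(BB2, max_steps=1000))  # -> (True, 6, 4)
```

The catch is the `max_steps` cutoff: when a machine has not halted by then, the simulation alone cannot tell you whether it ever will, and that is exactly the kind of question that needs a proof.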
More formally, my complaint with the culture of mathematics is:
1) We know that for any string of data, I(data : ZFC) ≤ min(K(data), K(ZFC)) + O(1), where K is Kolmogorov complexity and I is algorithmic mutual information
2) K(ZFC) is small. I think the best current upper bound is the description length of a Turing machine with a few hundred states (very roughly a kilobyte under a naive encoding), but I suspect the true value of K(ZFC) is far lower than that, likely no more than a few bytes
3) Thus K(data) - K(data | ZFC) ≤ K(ZFC) + O(1), i.e. "a few bytes"
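Spelled out, step 3 follows from step 1 in one chain. This assumes the convention I(x : y) = K(x) - K(x | y) for algorithmic mutual information; other conventions condition on a shortest program for y and can differ by small (at most logarithmic) terms.

```latex
\begin{align*}
K(\text{data}) - K(\text{data} \mid \text{ZFC})
  &= I(\text{data} : \text{ZFC}) \\
  &\le \min\bigl(K(\text{data}),\, K(\text{ZFC})\bigr) + O(1) \\
  &\le K(\text{ZFC}) + O(1).
\end{align*}
```

In words: conditioning on ZFC can shorten the best description of the data by at most about K(ZFC) bits, no matter how large the data set is.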
Consider the massive amounts of data that we collect to train LLMs. The totality of modern mathematics can provide no more than a few bytes of insight into the "essence" of this data (i.e., the maximally compressed version of the data), which directly translates to limited predictability of the data via Solomonoff induction. And that's in principle; it doesn't even consider the amount of time involved. If we want to do better, we need more axioms, full stop.
One might counter, "well sure, but mathematicians don't necessarily care about real-world problems". OK, just apply the same argument to the set of all arithmetic truths. Or the set of unprovable statements in the language of a formal system (that are true within some model). That's some interesting data. Surely ZFC can discover most "deep" mathematical truths? Not very likely. The deeper truths tend to occur at higher levels of the arithmetic hierarchy. The higher in the hierarchy, the more interesting the statement. And these are tiny statements too: ∀x ∃y ∀z [...] is already a Π_3 form. Well, we're already in trouble, because ZFC can only decide a small fraction of the Π_2 statements that can fit on a napkin, and it drops off very quickly at higher levels than that. Again, we need more axioms.
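For a sense of what a napkin-sized Π_2 statement looks like, here is one example of my own choosing (not one I'm claiming ZFC can't decide): the twin prime conjecture. The predicate name Prime is shorthand I'm introducing; it unfolds into bounded quantifiers only, so the statement as a whole has the ∀∃ shape.

```latex
\[
\forall n \;\exists p \;\bigl(\, p > n \;\wedge\; \mathrm{Prime}(p) \;\wedge\; \mathrm{Prime}(p+2) \,\bigr),
\qquad
\mathrm{Prime}(q) \;:\equiv\; q > 1 \;\wedge\; \forall a < q \;\forall b < q \;\; (a \cdot b \neq q).
\]
```

Every instance of the inner ∃ can be checked by a finite search, but the outer ∀ ranges over all naturals, which is what puts it a level above statements like Goldbach's conjecture (Π_1).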