I think the more correct statement is that there are different models of ZFC in which BB(748) is a different number. People find that weird because they don't think about non-standard models, as arguably they shouldn't.
How is that possible? That implies there’s at least one specific program whose execution changes based on the ZFC model. The rules of program execution are so simple, it doesn’t make sense that they’d change based on anything like that.
I don’t get it. Let’s say that BB(748) is 10,000. (I realize the true number is somewhat larger, this is just an example that doesn’t change the argument.) That means there’s one or more Turing machines of that size which run for that many steps. All of the others either run for fewer, or never stop.
Running for fewer steps is extremely well defined and I don’t imagine that enters into this.
That means the issue is “never stop”? That also seems pretty well defined to me. For BB(748) to vary based on your model, if the machines that run for fewer steps don’t change, then that means one of the machines that never stops in one model will stop in another. Or the BB winner for our model will never stop in another model.
How can changing your model make it so a specific Turing machine goes from stopping after 10,000 steps to never stopping, or from never stopping to stopping after 11,000 steps?
Yes the issue has to do with "never stops". One of the machines that never stops in one model will stop in another model.
So in one model a Turing Machine called R never stops. In another model R stops after Q steps. But here's the issue... Q isn't an actual natural number, what it is is some mathematical object that satisfies all of the properties of a natural number in ZFC, but is not an actual natural number. What it actually is is some infinitely large object that satisfies all of the Peano axioms of what a natural number is as well as satisfies the following set of rules:
Q > 0
Q > 1
Q > 2
Q > 3
...
Q is basically some infinitely large construct that from within the model appears to be finite, but from outside of the model is not finite.
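For anyone wondering where an object like Q can come from: the list of inequalities above is the standard compactness construction of a non-standard model. Here is a sketch for Peano arithmetic. The symbols T', c, and \underline{n} (the numeral for n) are my own notation, not from the comment above, and I am assuming the same idea carries over to the natural numbers inside a model of ZFC.

```latex
\[
  T' \;=\; \mathrm{PA} \,\cup\, \{\, c > \underline{n} \;:\; n \in \mathbb{N} \,\}
\]
% Any finite subset T_0 of T' contains only finitely many axioms c > \underline{n},
% so the standard model satisfies T_0 once c is interpreted as a large enough number:
\[
  T_0 \subseteq \mathrm{PA} \cup \{\, c > \underline{n} : n \le m \,\}
  \quad\Longrightarrow\quad
  (\mathbb{N},\; c := m + 1) \models T_0 .
\]
% By the compactness theorem T' has a model M, and in M the element c satisfies
\[
  M \models c > \underline{n} \quad \text{for every standard } n \in \mathbb{N}.
\]
```

This only shows that models containing such elements exist. It does not by itself show that any particular machine "halts" at a non-standard step; that part needs the independence result.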
So within this model, the Turing machine R halts after Q steps, and since from within the model Q is finite then from within this model BB(748) is at least equal to Q.
If BB(748) is actually 10,000, then we can add this as an axiom to ZFC to get a new formal theory ZFC + "BB(748) = 10000".
In this new theory the previous structure that contained Q as an element will not satisfy the definition of a natural number, so we don't have to worry about Q anymore... however, there will exist some number T > 748 where BB(T) is independent of our new theory. For BB(T), there will exist some other model that has its own Q* which satisfies all of our axioms including the axiom that BB(748) = 10000, but also that
What do you mean, Q isn’t a natural number? If you had unlimited time and paper, you could sit down and run the machine by hand, counting each step, until it reaches the halting state. You will have counted Q steps. Or the machine never stops. There’s no such thing as a machine that stops after a number of steps defined by an infinitely large construct. There are machines that stop after some whole number of steps, and there are machines that don’t stop. There are no others.
If there’s another model where this machine doesn’t stop, then that means that at some point during this process, you reach a particular machine state and tape contents and transition to a different state than you did in the first model. That has to happen, because otherwise the execution follows the same process as before, and halts at Q steps. But the mechanics of the machine don’t depend on your theory. They’re just state transitions and tape operations.
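To make "just state transitions and tape operations" concrete, here is a minimal sketch of a step-by-step simulator. The table format, the names `run` and `BB2_CHAMPION`, and the step budget are my own choices for illustration. The machine is the well-known 2-state, 2-symbol busy beaver champion, not one of the 748-state machines being discussed.

```python
from collections import defaultdict

HALT = "H"

# Transition table: (state, symbol read) -> (symbol to write, head move, next state).
# This is the classic 2-state, 2-symbol busy beaver champion.
BB2_CHAMPION = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, HALT),
}


def run(machine, max_steps):
    """Run from a blank tape; return (halted, steps_taken, tape)."""
    tape = defaultdict(int)  # unbounded tape, blank cells read as 0
    state, head, steps = "A", 0, 0
    while state != HALT and steps < max_steps:
        write, move, state = machine[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    return state == HALT, steps, tape


halted, steps, tape = run(BB2_CHAMPION, max_steps=100)
print(halted, steps, sum(tape.values()))  # True 6 4
```

For any ordinary step count you can actually reach, the trace is the same no matter who runs it. The disagreement between models can only show up in the claim "there is no n at which it halts", which quantifies over every n the model contains.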
Isn't that incompatible with the models being consistent?
Suppose model A proves BB(748) = X and model B proves BB(748) = Y > X. But presumably the models can interpret running all size 748 Turing machines for Y steps. Either one of the machines halts at step Y (forming a proof within A that BB(748) >= Y contradicting the assumed proof within A that BB(748) = X < Y) or none of the machines halts at step Y (forming a proof within B that BB(748) != Y contradicting the assumed proof within B that BB(748) = Y).
I'm guessing the only way this could ever work would be some kind of nastiness like X and Y aren't nailed down integers, so you can't tell if you've reached them or not, and somehow also there's a proof they aren't equal.
The issue is that X and Y are not actual natural numbers. They are mathematical objects that satisfy all the ZFC axioms and Peano arithmetic but are infinitely large. The issue is that ZFC underspecifies natural numbers.
Sure, if someone just gives you the number, ZFC can represent it. But ZFC cannot prove that the value is correct, so how do you know you have the right number? Use a stronger proof system? Go a bit bigger and same issue.
Not an expert, but I've read about this a bit because it bothered me also and I think this is the answer:
Most of these 'uncomputable' problems are uncomputable in the sense of the halting problem: you can write down an algorithm that should compute them, but it might never halt. That's the sense in which BB(x) is uncomputable: you won't know if you're done ever, because you can't distinguish a machine that never halts from one that just hasn't halted yet (since it has an infinite number of states, you can't just wait for a loop).
So presumably the independence of a number from ZFC is like that also: you can't prove it's the value of BB(745) because you won't know if you've proved it; the only way to prove it is essentially to run those Turing machines until they stop and you'll never know if you're done.
I'm guessing that for the very small Turing machines there is not enough structure possible to encode whatever infinitely complex states end up being impossible to deduce halting from, so they end up being Collatz-like and then you can go prove things about them using math. As you add states the possible iteration steps go wild and eventually do stuff that is beyond ZFC to analyze.
So the finite value 745 isn't really where the infinity/uncomputability comes from; it comes from the infinite tape that can produce arbitrarily complex functions. (I wonder if over a certain number of states it becomes possible to encode a larger Turing machine in the tape somehow, causing a sort of divergence to infinite complexity?)
And also, if BB were computable, then it could be used to solve the halting problem: run the Turing machine of size n for BB(n) steps, and if it hasn't halted yet, it never will. So the BB function is clearly not computable.
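That reduction can be sketched in a few lines. The `bb` argument below is a hypothetical oracle: no such function can exist for all n, so a lookup table of the known small values (for 2-symbol machines) stands in for it here. The names `halts`, `KNOWN_BB`, `CHAMPION`, and `LOOPER` are mine.

```python
from collections import defaultdict

HALT = "H"

# Known maximum step counts for 2-symbol machines with n states (n = 1..4).
KNOWN_BB = {1: 1, 2: 6, 3: 21, 4: 107}

# Two example machines: (state, symbol read) -> (write, head move, next state).
CHAMPION = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, HALT),
}
LOOPER = {
    ("A", 0): (0, +1, "A"),
    ("A", 1): (0, +1, "A"),
}


def halts(machine, n_states, bb):
    """Decide halting on a blank tape, given a step bound bb(n_states)."""
    budget = bb(n_states)
    tape = defaultdict(int)  # blank cells read as 0
    state, head = "A", 0
    for _ in range(budget):
        write, move, state = machine[(state, tape[head])]
        tape[head] = write
        head += move
        if state == HALT:
            return True
    # Still running after BB(n) steps: by the definition of BB it never halts.
    return False


print(halts(CHAMPION, 2, KNOWN_BB.__getitem__))  # True
print(halts(LOOPER, 1, KNOWN_BB.__getitem__))    # False
```

The simulation itself is trivial. Everything hard is hidden inside `bb`, which is exactly why a general `bb` would solve the halting problem.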
But to me as a layman that seems true regardless of formal axioms chosen, but I guess I need to read that linked thesis.
I am also not an expert, but this does not sound right to me. Gödel's incompleteness theorem shows that there are certain things that cannot be proven. Being independent of ZFC means that something is such a case. So BB(643) being independent of ZFC means that we cannot prove or disprove that a certain number is BB(643). Aka we don't have the math to know for certain.
Independence from ZFC means we can't prove that any given number is BB(643) using ZFC. It doesn't mean we can't prove it at all, e.g. one could use a stronger set theory like Morse-Kelley (MK), which can prove the consistency of ZFC, to verify the value of BB(643). (NBG would not do here: it is a conservative extension of ZFC and cannot prove ZFC's consistency either.) But there would be some n for which BB(n) is independent of that set theory, requiring a yet-stronger theory, and so on ad infinitum.
ZF & ZFC are as important as they are because they're the weakest set theories capable of working as the foundations of mathematics that we've found. We can always add axioms, but taking axioms away & still having a usable theory on which to base mathematics is much more difficult.
sure, but it is still very hard to wrap one's head around how the value of a function can be independent of ZFC, and how it could not be for (e.g.) 642 but then be true for 643. That was the point of my post. It seems like you could just... run the function on every 643-state input and see what the value is, which would in some sense constitute a "proof" in ZFC? but maybe not, because you wouldn't even know if you had the answer? That's the part that is so intriguing about it.
Some 643-state inputs never halt. Some 643-state inputs do eventually halt. Only if you can run them for infinite time can you determine whether a given machine halts in a finite length of time: for any finite time you pick, if the machine is still running it could still halt eventually. That's just the halting problem, the impossibility of solving it is quite famous and it's easy to find the proof stated more formally than I want to with the limits of HN's markdown.
The interesting bit is that they were able to construct a machine that halts if and only if ZFC is inconsistent: it searches for a proof of a contradiction from the ZFC axioms. Since a consistent axiomatic system can never prove its own consistency (another famous proof), ZFC can't prove that this machine never halts. And nobody can confirm that it never halts just by running it, since that would take infinitely many steps.
That machine has 643 states, so either it halts within BB(643) steps or it never halts. If ZFC is consistent, it never halts, and then ZFC can't prove that any particular number is BB(643): such a proof, combined with running the machine for that many steps, would show that the machine never halts, which ZFC can't prove.
Yes, which is why ZFC gets used. NBG & MK are stronger and occasionally used, but ZFC being weaker meant it got more popular since it's almost always good enough.
Yeah, but the vexing part is "how can that be true for e.g. N=643 but not N=642"? What happens at whatever number it starts true at?
Incidentally, Gödel's theorem eventually comes down to a halting-like argument as well (well, a diagonal argument). There is a presentation of it that is in like less than one page in terms of the halting problem---all of the Gödel-numbering stuff is essentially an antiquated proof. I remember seeing this in a great paper which I can't find now, but it's also mentioned as an aside in this blog post: https://scottaaronson.blog/?p=710
> What happens at whatever number it starts true at?
Usually, "what happens" is that the machines become large enough to represent a form of induction too strong for the axioms to 'reason' about. It's a function of the axioms of your theory, and you can add more axioms to stave it off, but of course you can't prove that your new axioms are consistent without even more axioms.
> There is a presentation of it that is in like less than one page in terms of the halting problem---all of the Gödel-numbering stuff is essentially an antiquated proof.
Only insofar as you can put faith into the Church–Turing thesis to sort out all the technicalities of enumerating and verifying proofs. There still must be an encoding, just not the usual Gödel numbering.
> Incidentally, Gödel's theorem eventually comes down to a halting-like argument as well (well, a diagonal argument).
> There is a presentation of it that is in like less than one page in terms of the halting problem
Those are two very different ideas. Your second sentence says that Gödel's theorem is easy to prove if you have results about the halting problem. Your first one says that in order to prove Gödel's theorem, you need to establish results about the halting problem.
I'm saying that if you want to understand why Gödel's theorem is true, look at the one-paragraph proof based on the halting problem, not the like 20-page one with Gödel numbers.
> Most of these 'uncomputable' problems are uncomputable in the sense of the halting problem: you can write down an algorithm that should compute them, but it might never halt. That's the sense in which BB(x) is uncomputable: you won't know if you're done ever, because you can't distinguish a machine that never halts from one that just hasn't halted yet (since it has an infinite number of states, you can't just wait for a loop).
> So presumably the independence of a number from ZFC is like that also: you can't prove it's the value of BB(745) because you won't know if you've proved it; the only way to prove it is essentially to run those Turing machines until they stop and you'll never know if you're done.
These aren't similar ideas. You can't know if a machine that hasn't halted yet will ever halt. But you can easily know if a machine that has already halted was going to halt.
Independence is the second case. For the value of BB(x) to be independent of ZFC, one of two things must hold:
(1) ZFC is inconsistent, and therefore all statements are independent of it.
(2) ZFC is consistent with two different statements, "BB(x) = a" and "BB(x) = b" for two different a, b. This means that a disproof of either statement cannot exist.
This, in turn, means that there is no observation you could ever make that would distinguish between the values a and b (for the identity of BB(x)). No matter what you believe the value of BB(x) might secretly be, there are no consequences; nothing anywhere could ever change if the value turned out to be different. Because, if there were an observable consequence of the value being different, the hypothetical observation of that consequence would be a disproof of the value that didn't cause it, and no such disproof can exist.
Neither value, a or b, can be more true than the other as the answer to the question "what is BB(x)?". It doesn't make sense to consider that question to have an answer at all.
> (2) ZFC is consistent with two different statements, "BB(x) = a" and "BB(x) = b" for two different a, b. This means that a disproof of either statement cannot exist.
> This, in turn, means that there is no observation you could ever make that would distinguish between the values a and b (for the identity of BB(x)). No matter what you believe the value of BB(x) might secretly be, there are no consequences; nothing anywhere could ever change if the value turned out to be different. Because, if there were an observable consequence of the value being different, the hypothetical observation of that consequence would be a disproof of the value that didn't cause it, and no such disproof can exist.
There's one part of this I don't understand. "BB(x) = n" means "there is at least one x-state Turing machine that halts after exactly n steps, and there are no x-state Turing machines that halt after more than n steps", right? Then why wouldn't this approach work (other than the numbers being way too big to actually do in this universe)? WLOG, assume a < b. Run all possible x-state Turing machines for b steps. If any halted on step b, then you've disproved "BB(x) = a". If not, then you've disproved "BB(x) = b".
The trick is that if none halt on `b` steps, you don't know that BB(x)<b. Specifically, if you have one TM that keeps going, you don't know whether that TM halts eventually or keeps going forever.
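Here is a sketch of that brute-force experiment for 2-symbol machines, to show what it does and doesn't tell you. The encoding (states numbered from 0, `HALT = -1`) and the names `steps_to_halt` and `survey` are my own assumptions, and this is only feasible for tiny n.

```python
from itertools import product

HALT = -1  # assumed encoding: states are 0..n-1, HALT is -1


def steps_to_halt(table, budget):
    """Return the number of steps taken to halt, or None if still running."""
    tape, state, head = {}, 0, 0
    for step in range(1, budget + 1):
        write, move, state = table[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        if state == HALT:
            return step
    return None


def survey(n_states, budget):
    """Run every n-state, 2-symbol machine for `budget` steps from a blank tape."""
    keys = [(s, sym) for s in range(n_states) for sym in (0, 1)]
    actions = [(w, m, nxt)
               for w in (0, 1)
               for m in (-1, +1)
               for nxt in list(range(n_states)) + [HALT]]
    longest, undecided = 0, 0
    for choice in product(actions, repeat=len(keys)):
        result = steps_to_halt(dict(zip(keys, choice)), budget)
        if result is None:
            undecided += 1  # could halt later, could run forever
        else:
            longest = max(longest, result)
    return longest, undecided


longest, undecided = survey(n_states=2, budget=50)
print(longest, undecided)
```

For 2 states `longest` comes out as 6, which matches the known value, but the code has no way of knowing that. All it establishes is a lower bound, plus a pile of `undecided` machines that would each need a separate proof of non-halting.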
It has to come from a finite value (specifically, the amount of complexity that can be encoded in 745 pieces of information https://turingmachinesimulator.com/shared/vgimygpuwi), because the finite size 745 with infinite tape leads to uncomputability, but the size 5 does not.
In a very real sense, a deep kind of infinite complexity can be generated from 745 objects of a certain kind, but not from 5 objects of that kind.
Turing machines have infinite tape, not infinite state. The entire set of all halting machines of a given size collectively only use finite tape. Totally finite. Only (some of) the non-halting machines use infinite tape.
The problem is that we don't know in advance how large the (definitely finite) upper bound on the amount of tape used by all the size-N halting machines is, until after enough of them (one per known equivalence class) halt. And we don't know (in general) how to run all the halting ones until they halt, without also running a non-halting program for an unbounded amount of time.
TL;DR: unbounded is not infinite, but big enough to be a problem.
I am aware it's an infinite tape and finite state (maybe I misspoke somewhere), as well as the halting machines using finite tape (because of course they do).
But the overall 'complexity' (at a timestep, say) is going to be due to the states and the tape together. The BB(5) example that was analyzed, iirc, was a Collatz-like problem (Aaronson describes it here: https://scottaaronson.blog/?p=8088 ). My interpretation of this is that:
1. collatz-like functions have a lot of complexity just due to math alone
2. 5 states turned out to be enough to "reach" one of them
3. more states means you're going to reach more possible Collatz-like functions (they don't have to be Collatz-like; it's just easier to think about them like that)
4. eventually you reach ones that ZFC cannot show to halt, because there is effectively no way to prove it other than running them, and then you would have to solve the halting problem.
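As a small illustration of points 1 and 4, here is the ordinary Collatz map with a step budget. This is a stand-in I chose for illustration, not the Collatz-like function that the 5-state machine actually iterates, and `collatz_steps` is a made-up name.

```python
def collatz_steps(n, budget):
    """Count steps until n reaches 1 under the Collatz map, or None if the budget runs out."""
    steps = 0
    while n != 1:
        if steps >= budget:
            return None  # undecided: not the same as "never reaches 1"
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


print(collatz_steps(27, budget=1000))  # 111
print(collatz_steps(27, budget=50))    # None
```

The loop is three lines long, yet nobody has a proof that it terminates for every starting value. Running it can confirm halting for one input, but a `None` never confirms the opposite.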
The part that helped me be less unsettled by BB(745) being independent of ZFC was the notion that it eventually boils down to a halting problem, and asking ZFC to "solve" it... which is more agreeable than the idea that "ZFC cannot compute a function that seems to be solvable by brute force".
We need to distinguish between a computer that's equivalent to BB(n), and a computer big enough to compute the value of the number that is BB(n). By (terrible) analogy: a 4004 can be made to write a finite loop that describes how many FLOPs the number 1 supercomputer can compute without, itself, being able to usefully perform the computations of that supercomputer. (The 4004 will run out of memory/addressable disk space.) Similarly, we can no longer build decidable programs in ZFC that can compute the number BB(748). Scott is saying that they now think this "disassociation" might occur at BB(7)!
To try and help people digging into this, the following helped me.
Two lenses for trying to understand this are potentially Chaitin's limits on the output of a Lisp program being more complex than the program itself [1] or Markov's proof that you can't classify manifolds in d >= 4.
If you try the latter and need/want to figure out how the Russian school is so different this is helpful [2]
IMHO the former gives an intuition for why, and the latter explains why.
In ZFC, C actually ends up implying PEM, which is why using constructivism as a form of reverse math helped it click for me.
This is because in the presence of excluded middle, every sequentially complete metric space is a complete space, and we tend to care about useful things, but for me just how huge the search space grows was hidden due to the typical (and useful) a priori assumption of PEM.
If you have a dislike for the constructive approach, or don't want to (or can't) invest in learning an obscure school of it, this recent paper[3] on the limits for finding a quantum theory of everything is another lens.
Yet another path is through Type 2 TMs and the Borel hierarchy, where you can have an uncomputable number on the input tape but your algorithms themselves cannot use it, while you can produce uncomputable numbers by randomly selecting and/or changing an infinite sequence.
Really it is the difference between expressibility and algorithms working within what you can express.
Hopefully someone else can provide more accessible resources. I think a partial understanding of the limits of algorithms and computation will become more important in this new era.
Looking at [3], they seem to argue that the system isn’t complete for the usual Gödel reasons, which, sure, it isn’t, but then they call the claim that the system fails to decide, which is a statement about provability, a “scientific fact”. This seems to me like a mistake?
Like, a TOE is not expected to decide all statements expressible in the theory, only to predict particular future states from past states, with as much specificity as such past states actually determine the future states. It should not be expected to answer “given a physical setup where a Turing machine has been built, is there a time at which it halts?” but rather to answer “after N seconds, what state is the machine (as part of the physical system) in?” (for any particular choice of N).
Whether a particular statement expressed in the language of the theory is provable in the theory, is not a claim about the finite-time behavior of a physical system, unless your model of physics involves like, oracle machines or something like that.
Edit: it later says:
“Chaitin’s theorem states that there exists a constant K_{ℱ_{QG}}, determined by the axioms of ℱ_{QG}, such that no statement S with Kolmogorov complexity K(S) > K_{ℱ_{QG}} can be proven within ℱ_{QG}.”
But this, unless I’m badly misinterpreting it, seems very wrong? Most formal systems of interest have infinitely many distinct theorems. Given an infinite set of strings, there is no finite universal upper bound on the Kolmogorov complexity of the strings in that set.
Maybe this was just a typo or something?
They do then mention something about the Bekenstein bound, which I haven’t considered carefully yet but seems somewhat more promising than the parts of the article that preceded it.
It looks like the authors of [3] misunderstood Chaitin. What Chaitin said about the limits of provability is that no statements of the form "K(x) > c_F" can be proven in formal system F where c_F is some constant depending on F.
I will admit that I added that cite mostly because of the very real barriers to even learning RUSS.
By the typos etc., you can probably also tell I was doing this on mobile, unfortunately as a passenger in a car.
To quote Chaitin’s explanation here:
> In contrast I would like to measure the power of a set of axioms and rules of inference. I would like to be able to say that if one has ten pounds of axioms and a twenty-pound theorem, then that theorem cannot be derived from those axioms.
This paper's notation does seem to be confusing, but I still think it is essentially complete with the above.
"K_{ℱ_{QG}}" would probably most commonly be L in most descriptions, a natural number that is the upper bound of complexity for provable statements in a formal system S
L is not a limit on complexity. It means that, for any string, there is no formal proof in S that the string's Kolmogorov complexity exceeds L.
You can still prove that there are strings far more complex than L with S, and in fact there will often be far more of those strings than the ones equal to or less than L.
It is a limit on what you can prove in S about those strings with a greater Kolmogorov complexity.
Or to rewrite the above:
"There exists a natural number L such that we can't prove the Kolmogorov complexity of any specific string of bits is more than L."
Does that help or did I miss the mark on your objection?
Their notation of “K_{ℱ_{QG}}” wasn’t a problem. Seems a reasonable name for a constant associated with Kolmogorov complexity and a formal system which they’ve named ℱ_{QG}.
The issue is that what they said seemingly was not
"There exists a natural number L such that we can't prove the Kolmogorov complexity of any specific string of bits is more than L."
But
"There exists a natural number L such that we can't prove (in ℱ_{QG}) any statement S whose complexity is more than L.",
which is wrong.
They later go on to say “These strings cannot be generated by programs of length <= n, and hence cannot correspond to provable statements in ℱ_{QG}.” which follows from the previous wrong statement but doesn’t follow from the accurate statement you gave, which seems to suggest that they really did mean the inaccurate statement that they wrote, not the correct one you wrote.
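Writing the two readings side by side may make the difference easier to see. The notation is my own shorthand: F stands for the formal system, L for the constant, and K for Kolmogorov complexity. The counting bound in the last line is the standard one.

```latex
% Chaitin's theorem (the accurate statement): F never proves a lower bound above L.
\[
  \exists L \;\, \forall x : \quad F \nvdash \text{``} K(x) > L \text{''}
\]
% The reading the paper's wording suggests: complex statements are unprovable.
\[
  \exists L \;\, \forall S : \quad K(S) > L \;\Longrightarrow\; F \nvdash S
\]
% Why the second fails: there are fewer than 2^{L+1} programs of length at most L,
\[
  \bigl|\{\, S : K(S) \le L \,\}\bigr| \;<\; 2^{L+1},
\]
% so a system with infinitely many distinct theorems must prove some S with K(S) > L.
```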