Hacker News new | ask | show | jobs
by chriswarbo 3202 days ago
> The proof is like a game. It says: give me any procedure for (putatively) making a list of all of the real numbers, and I will give you back a number that is not in the list.

The game isn't fair though; we have to write down our complete solution (e.g. in the form of a never-halting, co-recursive computer program), then Cantor can take as much time as he likes (any finite number of steps) to analyse the source code and find a Real it'll never output.

Yet we can turn the tables to play a different game: whatever counter-example-finder Cantor chooses, if we make him write it down (as a computer program), we can always (eventually) find a program which will trick it: outputting only a countable number of Reals, but always a superset of those checked for by Cantor's program.

Of course, if we're representing the players of the game using computer programs, then we can go one step further and show that only the Computable numbers (a subset of the Reals) can ever be generated, and the computables are countable! (We can count them by pairing off all programs with all runtimes)

1 comments

> if we make him write it down (as a computer program)

He did. Diagonalization is an algorithm. It's a non-halting algorithm (because it operates on infinite input and produces infinite output) but it is nonetheless a perfectly well defined algorithm that can be implemented by a Turing Machine.

> we can always (eventually) find a program which will trick it

No, you can't.

You're right about diagonalization when it's given our choices as input. When I said 'writing down' I meant 'in a self-contained way', i.e. without taking any input (such as our choices).

This rules out diagonalization, since that can't be run until it has a list of numbers to diagonalize. In the 'reverse' game, we're first running the counter-example-outputting program that Cantor provides, and then using its output to pick our numbers. If Cantor gives us a diagonalization program, we'll get a deadlock. In fact, the strategy to beat Cantor in this reverse game is simply the identity function: if Cantor's program claims that "X" isn't in our list, we can refute it by outputting "X".

The way I think about it is this: if we give Cantor our program up-front, he can run it to see what we're going to choose, then pick his 'move' accordingly (e.g. via diagonalization); if Cantor gives us his (self-contained) program up-front, we can run it to see what 'move' he's going to choose, and pick ours accordingly (e.g. via the identity function).

Both games are a win for whoever goes second.

Alternatives to the 'up-front' approach include:

- Interleaving or concurrent outputs, in which case there would be no winner: each time a number is added to the list, diagonalization extends its counter-example with another digit; each time the counter-example is extended, a number with matching prefix is added to the list.

- Interleaving or concurrent programs, with access to each others' source. I think the winner would be whoever has the most computing power (and can hence simulate their opponent further into the future).

> if Cantor gives us his (self-contained) program up-front,

Which he did. Diagonalization is an algorithm (and a trivial one at that).

> we can run it to see what 'move' he's going to choose

No, you can't, because one of the inputs to his algorithm is your algorithm. So you can't run his algorithm until you've committed to yours.

> Both games are a win for whoever goes second.

That's right. But Cantor has to go second. That's why his proof is correct.

Sorry if I wasn't making myself clear: I'm not at all trying to refute Cantor's proof by diagonalization of the uncountability of the reals. There seem to be others in this thread who do take issue with it, but I don't.

It looks like my point's been misunderstood. I'm pointing out that the whole reason that Cantor's proof works (and it does work) is because we're forced to provide our complete list up-front. Or, for those who don't like the idea of 'writing down' something infinite, we must provide a 'description' or 'program' which uniquely defines/generates our list.

You say as much here:

> you can't run his algorithm until you've committed to yours.

Exactly. In Cantor's 'game', his counter-example (a real number not in our list) is generated with full knowledge of our list. Our list is generated with no knowledge of his counter-example.

That's Cantor's proof, and it is not what I'm talking about.

I'm talking about a different 'game', where the dependency is reversed: the "counter-example" is generated with no knowledge of the list; the list is generated with full knowledge of the "counter-example", and hence is able to refute it (by including the "counter-example" in the list).

This 'inverted' setup has two important differences from Cantor's setup:

1) We must replace the program/description of 'a list of reals' with a program/description of 'a function from a real to a list of reals'.

2) We must replace the program/description of 'a function from a list of reals to a real' with a program/description of 'a real number'

In particular, point (2) means that diagonalization is irrelevant in this setup (which I repeat is not Cantor's setup!). It is irrelevant because it doesn't have the right type: it is a function from a list of real numbers to a real number but there is no place for such a function in this alternative setup; instead, we need a plain old real number. This is what I was trying to get across by saying it must be 'written down up-front' (i.e. not after waiting for us to think of a list), or being 'self-contained' (i.e. not requiring an input/reference to our list).

I hope that makes it clear why the following refutations aren't applicable:

> > if Cantor gives us his (self-contained) program up-front,

> Which he did. Diagonalization is an algorithm (and a trivial one at that).

Given what I've said above, we see that diagonalization is not self-contained/provided-up-front/fully-written-down or any of the other phrases I've tried to use to get my point across. Yes, we can write a program for diagonalization, but such a program encodes a function, which has no place in this scenario (which is not the one in Cantor's proof!).

> one of the inputs to his algorithm is your algorithm

Again, the algorithm cannot take my/your/anyone's algorithm as an input, since it cannot take anything as an input, since it is not a function, since I am not talking about Cantor's proof by diagonalization of the uncountability of the real numbers. I am talking about a different situation, in which the "counter-example" is just a number, and hence cannot be provided with any inputs, whether they be algorithms or unicorns.

> > Both games are a win for whoever goes second.

> That's right. But Cantor has to go second.

By "both games" I mean Cantor's proof and this different situation. Cantor doesn't have to go second; that's my point! Cantor wins because he chooses to go second. If we choose to go second, then we will win, by constructing a list which includes the supposed "counter-example".

> That's why his proof is correct.

I know it is; but I've never claimed otherwise, and that is completely beside the point.

The point is that Cantor's proof isn't actually about the real numbers! Instead, it's about how some infinite structures (like "all numbers") are too rich to be completely captured by any particular finite representation (there will always be missing numbers). It suffices to use a countable infinity, like the computable numbers.

Alternatively, we can think of Cantor's proof as talking about computational resources: diagonalization is trivial to state, as you say, but it is computationally difficult to run. This is because it runs the list-generator as a subroutine (to find the digits), which can be made arbitrarily hard by generating the list in an arbitrarily complex way. Since diagonalization then adds 1 mod 10, it always ends up doing slightly more work than it takes to generate the list. Hence we can see Cantor's proof as statement that more powerful computers can always beat less powerful computers, assuming the code is all globally known, because the faster computers can simulate the slower ones to see what they'll do. That the essence of the second player always winning.

> Sorry if I wasn't making myself clear

You still aren't.

> diagonalization is not self-contained/provided-up-front/fully-written-down or any of the other phrases I've tried to use to get my point across

But it is. The fact that one of its inputs happens to be a function doesn't make it any less self-contined.

> The point is that Cantor's proof isn't actually about the real numbers!

It isn't just about the reals, but it is definitely about the reals. The claim is: there exist well-defined mathematical structures that cannot be put into a one-to-one correspondence with the naturals. The proof is constructive, with the reals as the canonical example. Cantor's proof is about the real numbers in the exact same sense that Turing's proof of the non-computability of the halting problem is about Turing machines.

> diagonalization is not self-contained/provided-up-front/fully-written-down or any of the other phrases I've tried to use to get my point across

> But it is.

No it's not. It's a function. That is exactly the opposite of what I mean by "self-contained" (and all of the various rephrasings of how I defined it).

> The fact that one of its inputs happens to be a function doesn't make it any less self-contined.

True, it doesn't become less self-contained, because it's already not self-contained because it has an input. A function whose input is a function is just as self-contained as one whose input is a natural; or whose input is a teapot: they're all, precisely, not "self-contained" as per my definition.

I can't think of any clearer way of stating it. Maybe you're getting derailed by this phrase because you're interpreting it in some way other than the various equivalent, precise definitions I have given over and over?

I will just diagonalize forever before I give the set to Cantor. If he claims that he can diagonalize the set to generate a new number not in the set then I will say "Oh, yeah - I missed one. I wasn't really done after all. Give it back to me and I will add some more to it before you get your hands on it!". As you can see, Cantor will never get a list because I will never really be done generating it.

I could just give him the list and say that whatever he adds is precisely what I would have added had I continued. This is like an infinite lazy list in programming.

> Cantor will never get a list because I will never really be done generating it.

That would be true even if you weren't diagonalizing because the list is infinite. In fact, just a single item in the list is potentially infinite. So you can't generate the whole list regardless.

What you have to do is to produce an algorithm that takes any two natural numbers i and j as input and produces as output in finite time the j'th digit of the i'th number in the list. Because you have to produce your digit in finite time you can only diagonalize a finite number of times, so Cantor can always do you one better and produce a number not on your list.

Why is it that Cantor can do an infinite procedure of diagonalization but I can't? If he can diagonalize then I can too. Is it possible that Cantor's "algorithm" is not really an algorithm? Knuth says an algorithm must be correct and must also terminate.

Regardless, I have a lot to ponder.

Cantor is playing by the exact same rules that you are. His burden is: given an integer i, produce in finite time the i'th digit of a real not in your list. (He can't produce the whole thing in finite time because it's infinite, obviously.) He does this by using your algorithm to produce the i'th digit of the i'th row and adds one to it (mod 10).
> given an integer i, produce in finite time the i'th digit of a real not in your list

Yes; to be clear, my point is simply that there are two variants of this task:

- Given an integer i and our list (or the infinite procedure that generates it), then the task is easy since we can diagonalise.

- Given an integer i and no knowledge of our list, the task becomes impossible, since the supposed "real not in your list" is actually independent of the list (by definition); hence we're free to choose any list we like, including waiting until after the supposed counter-example has been generated, and sticking that at the head of our list.

Okay, let me try to write these "algorithms" in a concrete term.

Assume that you have a way of generating a sequence R_1, R_2, R_3, ... of real numbers in [0, 1]. (For simplicity, let's just consider [0, 1], because it still has the same cardinalityas R.) You have your "algorithm":

    generate_digit(i, j):
        Somehow compute the j'th digit of R_i.
        return the digit
You are allowed to spend infinitely long time to return each digit. (That is, you can use constructions that are not even computable in finite time, as long as you can prove that given i and j, there is exactly one digit that satisfies your criterion.)

Cantor's objective is to defeat your algorithm by generating a real number not in the list. Similarly as above, it can be written as a function that returns the j'th digit, given j. Behold it in its full glory:

    defeat_indexing(j):
        digit = generate_digit(j, j)
        return (digit + 1) % 10
That's it.