He originally made the argument about gender, not intelligence. I think he was arguing for a whole class of properties for which there's no difference between authenticity and convincing fakery.
I think the point is less that there is a truth and we're too dumb to figure it out, and more that in certain circumstances we'll just have to accept a lower bar for evidence about whether those properties apply.
It reminds me of how no class of computer can solve the halting problem for itself. No matter how intelligent you are, there will be holes like this in your certainty about some things.
Or, to put it another way: it's not a binary problem, it's a probability continuum.
Even the definition of 'human intelligence' is a continuum from the smartest to the dumbest of us, and it doesn't stop there: it descends through all animal life.
I did some research to prove you wrong, because I don't think continuum is the right concept, but it turns out that Turing seems to agree with you. Quoting him in "Computing Machinery and Intelligence":
> In short, then, there might be men cleverer than any given machine, but then again there might be other machines cleverer again, and so on.
So now I think you're both wrong :) Particularly I take issue with the assumption that the "cleverer" relation is transitive. We've only really studied a few relations in this space:
- pushdown automata are cleverer than finite state machines
- Turing machines are cleverer than pushdown automata
- humans are cleverer than Turing machines (I'd argue for this, but others would disagree)
Presumably there are other points which we have overlooked or not yet discovered. For instance, maybe something which has the "memory" quality of a pushdown automaton, but lacks the "state tracking" property of a finite state machine. When compared with an FSM, such a thing would not be more or less clever than it, it would just be clever in an orthogonal way.
I strongly suspect that two intelligences (of greater power than the theoretical machines that we have so far) could meet and discover that they each have a capability that the other lacks. This would be a situation that you couldn't map onto a continuum: you'd need something with branches, a tree or a DAG or a topological space, something on which the two intelligences could be considered cousins, neither possessing more capabilities than the other, but each possessing different capabilities. (Unlike the FSM example, they would have to share some capabilities, otherwise they couldn't recognize each other as intelligent.)
Further, I suspect that in order to adequately classify both intelligences as cousins, you'd have to be cleverer than both. Each of the cousin-intelligences would be able to prove among themselves that theirs is the superior kind, but they'd also have to doubt these proofs because the unfamiliar intelligence would be capable of spooky things which the familiar intelligence is not.
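For what it's worth, the "cousins" idea can be sketched as a partial order on capability sets. This is only an illustration under loud assumptions: the capability names (`state_tracking`, `stack_memory`, `unbounded_tape`, `x`, `y`) and the `compare` helper are made up for the example, and treating a machine as a flat set of capabilities is a toy model, not a claim about real automata theory.

```python
# Toy model: an "intelligence" is just a set of capabilities (hypothetical names).
# Subset inclusion is a partial order, so some pairs simply cannot be ranked.

def compare(a: frozenset, b: frozenset) -> str:
    """Compare two capability sets under subset inclusion."""
    if a == b:
        return "equal"
    if a < b:   # proper subset: b can do everything a can, and more
        return "less"
    if a > b:   # proper superset
        return "greater"
    return "incomparable"  # each has something the other lacks

FSM = frozenset({"state_tracking"})
PDA = frozenset({"state_tracking", "stack_memory"})
TM = frozenset({"state_tracking", "stack_memory", "unbounded_tape"})

# The hypothetical thing with "memory" but no "state tracking".
MEMORY_ONLY = frozenset({"stack_memory"})

# Two "cousins": a shared core plus one capability each that the other lacks.
COUSIN_A = frozenset({"state_tracking", "stack_memory", "x"})
COUSIN_B = frozenset({"state_tracking", "stack_memory", "y"})

print(compare(FSM, PDA))            # less
print(compare(PDA, TM))             # less
print(compare(FSM, MEMORY_ONLY))    # incomparable
print(compare(COUSIN_A, COUSIN_B))  # incomparable
print(sorted(COUSIN_A & COUSIN_B))  # the shared core they could recognize in each other
```

The FSM, PDA and TM sets form a chain, which is what a continuum would predict; the last two comparisons are the cases a single line can't represent.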
I mean, an evolutionary tree where intelligence features are added in some branches makes sense.
I guess part of what I was trying to address is that we like to think of intelligence as what people do, and of ourselves as its pinnacle, while discounting anything that is not covered by that.
I definitely agree that defining intelligence as what humans do is a problematic practice. I guess I just wanted to nitpick a little.
There's definitely a lot of "it's not real intelligence because it's not human intelligence" going around these days. Doesn't seem like it's going anywhere useful though.
To me, Turing's argument has always been that the attribution of intelligence (or not) doesn't make much sense: rather than being a question of any substance, it merely diffuses into a matter of appearance. However, as these things usually go, the Turing test has become the holy grail for claiming "intelligence" (which really should be used in this context in quotes only).
In actuality, the Turing test primarily tests the knowledge of the user, not their intelligence. I'm sure that GPT-written paragraphs seemed really impressive when you first encountered them, but nowadays half the Internet has seen enough of them to recognise the default ChatGPT style in less than the space of a Tweet. People aren't significantly smarter than they were a few years ago, but I bet GPT-3.0 would perform significantly worse on a Turing test now than it did the day it was released.
Similarly, I believe a lot of early Turing Test successes kind-of cheated and had their bots pretend to be ESL speakers, on the grounds that the interrogators would interpret their unnatural responses not as a robot's but as a second-language speaker's human mistakes. But people who teach English as a second language, or who interact with language-learners a lot, will learn the types of mistakes each group of learners makes, and will spot unnatural mistakes a lot faster.
Now that I think about it, the fact that a major factor in Turing test performance is the testers' knowledge rather than their intelligence highlights why it's not a great measure of intelligence in the first place.
I've often felt that a better version is not whether a person can guess that it's an AI or a human, but whether people behave and feel differently with an AI than with a human.
That's vague and covers a universe of criteria (mood, satisfaction with the conversation, actual behavior and so forth), but I think it is also a more realistic gauge of AI performance. It's probably unattainable, but that's not necessarily a bad thing. If it is attainable within some level of confidence, then it's a pretty powerful AI.
There are probably some people who would be ok with some AI for some purposes.
In a sense, the question of the "intelligent machine" is somewhat self-contradictory: To us, the question of intelligence matters as a precondition or qualifying term: it determines to what extent, and with what probability and prospects, we may appeal to sympathy, morals and ethics. (In other words, it is not about trust in any realistic faculties, but about judgement – and then, to what extent we may trust in this.) However, this prospect doesn't fit well with our expectations of machines, which are all about repeatability and reproducible results within given tolerances… (Compare ChatGPT's so-called winter depression and the arising need to plead and argue with the device for any complex results. As the device gains in the emotional domain, its worth in the application domain radically decreases.)