If a digital thermometer connected to a black box reads 100°C, are we thereby required to believe that there's boiling water inside the box?
Science doesn't deal with the "indistinguishable". We cannot, on earth, simply distinguish whether we go around the sun or the sun goes around the earth.
Does the solar system have a soul?
The world exists, and it has properties, and those are independent of how dumb apes happen to be and what we are in a position to "distinguish" or otherwise.
A system generating text is acting as if it's having its intelligence measured. We take each sentence to be a symptom of its having a theory of the environment, having something to say about it, having some intention, etc.
When I say, "I don't like what you're wearing!" that sentence itself isn't somehow "intelligent". It is only a valid measure of my caring, preferring, speaking, intending, thinking... because that is why I said it.
A shredder which happened to assemble those words is likewise not intelligent.
This is basic science: measurements aren't objects, and measurements have validity criteria, which include, at least, that the causal properties of the system give rise to those measures.
In the case of ChatGPT, no relevant properties give rise to its output. Its sentences are not caused by any intelligence, and aren't valid measures of it.
There is no boiling water. Your digital thermometer is broken.
I agree it's a red herring to focus on output and interactive behavior when discussing this.
If a "shadow prompt" told ChatGPT that it writes at a 3rd grade level, we wouldn't argue as much over how smart the bot is.
If it omitted the friendly/helpful/deferential assistant stuff, we'd also argue about it less. Bing's initial defensiveness and aggression made it seem even stupider than the mistakes it was making.
They're homing in on better prompts and other configuration that will make the bot seem smarter. It seems smarter to say "I can't answer that question" than to confidently say something untruthful.
But the underlying computational program (GPT trained on the internet) is the same. If we judge the program's intelligence based on its output, it isn't well defined. The same thing looks intelligent or hilariously unintelligent based on the tokens you (an intelligent person) provide it with.
Or in other words... Suppose we collect all of the system's "intelligent" outputs and disregard the rest. We throw away a lot, the majority of responses, and the resulting set looks impressively smart.
The system appears to demonstrate advanced machine intelligence when restricted to (some?) preimages of this set, even though it acts like a total idiot over other parts of the domain. And it's clear that it takes real knowledge and understanding to solve this boundary problem, so that the calculated image has an "intelligent" shape.
Actually, we can determine whether the Sun goes around the Earth or the other way around: if we can create a better, more accurate model with greater predictive power, then we can assume that model is more likely to be correct.
As I understand it, this was initially one of the main issues with the new model proposed by Copernicus: it was not more accurate at first.