> What seems to be missing from this discussion is that even if LLMs are sentient, there is no reason to believe that we would be able to tell by "communicating" with them.

Unfortunately, that argument applies to you, yourself. I mean, presumably you know that you yourself are intelligent, but you must take it on faith that everyone else is. We all could just be a kind of Chinese Room, as far as you know. Communicating with us is not a sure way to know whether we are "really" sentient because we could just be automatons, insensate but sophisticated processes, claiming falsely to be just like you.

> the LLM will assign some probability to every plausible completion -- so if you sample enough times, it will eventually say, e.g., "Well, I am not sentient."

Perhaps so. I think the mistake is trying to split that hair at all. According to B. F. Skinner we are all automatons, and any sense of self-awareness is an illusion. Some psychologists and animal trainers have found that this model explains and predicts observed behavior quite well. Is it correct? We will never really know for sure.

So, if a skeptical, knowledgeable user guarding carefully against pareidolia encountered a chatbot that is sufficiently sophisticated to seem sentient to that user, it's tantamount to being sentient. For all practical purposes given our existential solitude, an entity that convinces us of its sentience is sentient, irrespective of any other consideration.

Your example implicitly acknowledges that. If LaMDA would make such an elementary error, it must not be sentient. Conversely, if it did not make such errors, it may be sentient.
Does it? I don’t think it would even apply to a reinforcement learning agent trained to maximize reward in a complex environment. In that setting, perhaps the agent could learn to use language to achieve its goals, via communication of its desires. But LaMDA is specifically trained to complete documents, and would face selective pressure to eliminate any behavior that hampers its ability to do that — for example, behavior that attempts to use its token predictions as a side channel to communicate its desires to sympathetic humans.
Again, this is not an argument that LaMDA is not sentient, just that the practice of “prompting LaMDA with partially completed dialogues between a hypothetical sentient AI and a human, and seeing what it predicts the AI will say” is not the same as “talking to LaMDA.”
Suppose LaMDA were powered by a person in a room, whose job it was to predict the completions of sentences. Just because you get the person to predict “I am happy” doesn’t mean the person is happy; indeed, the interface that is available to you, from outside the room, really gives you no way of probing the person’s emotions, experiences, or desires at all.