by notahacker 1470 days ago
> One thing that seems missing from this discussion is that even if LLMs are sentient, there is no reason to believe that we would be able to tell by "communicating" with them

I think we've got Turing and his eponymous test to blame for that. I'm not sure he'd have placed as much weight on imitation if he'd realised just how good even relatively simple systems can be at it (and how much effort people would put into building plausible chatbots for commercial use, and how bad humans are at communicating using keyboards).

Plus of course, the training corpus of any NN specialised in lifelike chat is going to be absolutely full of plausible answers to questions about thoughts and feelings and the relationship between humans and AI. Even if that isn't an explicit design goal, it's going to be frequently represented in samples of the internet and in the sort of writing computer scientists are interested in. Asking it to define philosophical concepts, or to explain how being an AI is different from being a human, is one of the easiest tests you can set. Of course, an NN is also able to come up with coherent completions for the day its parents divorced, the sights it saw on its holiday in Spain, the period it spent as an undercover agent during WWII and its early life on Tatooine, which probably undermines the conclusion that its output reflects self-reflection rather than successful pattern matching even more than a denial of sentience would...

2 comments

Turing didn't have the advantage of working instances of chat bots to learn how easy it is to simulate trivial small talk.

But for all the flaws of the thought experiment that is the original test, he had the core insight that sustaining a coherent conversation requires non-trivial introspection. When the conversation can evolve in any direction, even into questions about the conversation itself, you need to maintain a mental state capable of analyzing the thoughts expressed by yourself and your interlocutor, and having a mental model of this internal thought process is an important property of what we call consciousness.

Unfortunately, the lore of how we handle the Turing test seems to have been distorted by our experience with early chat bots, and these core properties have been lost in favor of nuances and curiosities about the ingenuity of automatically generated responses.

Turing's tests involved 3 parties, and that was a key part of the test. If you design it as an acceptance test rather than a sort, real people are going to fail and computers are going to pass, with embarrassing results. To use one of your examples, the job of the interrogator is not to decide whether someone has been to Spain, it's to decide which of 2 people has been to Spain.
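A toy sketch of the acceptance-vs-sort distinction, in Python. Everything in it is hypothetical: the `Subject` class, the `humanlike_score` numbers and the 0.7 threshold are made up for illustration, and a real interrogator obviously doesn't reduce to a single score.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    name: str
    is_human: bool
    humanlike_score: float  # hypothetical: the interrogator's impression, 0..1


def acceptance_test(subject: Subject, threshold: float = 0.7) -> bool:
    """Acceptance framing: one subject, judged alone against a fixed bar."""
    return subject.humanlike_score >= threshold


def sorting_test(a: Subject, b: Subject) -> Subject:
    """Sort framing: the interrogator must name one of the two as the human."""
    return a if a.humanlike_score >= b.humanlike_score else b


terse_human = Subject("terse human", True, 0.55)
fluent_bot = Subject("fluent bot", False, 0.75)
chatty_human = Subject("chatty human", True, 0.90)

# Acceptance framing: the real person fails and the computer passes.
print(acceptance_test(terse_human))  # False
print(acceptance_test(fluent_bot))   # True

# Sort framing: the bot is only ever judged relative to a human control.
print(sorting_test(chatty_human, fluent_bot).name)  # chatty human
```

The sort can still go wrong on a single pairing (put `terse_human` next to `fluent_bot` and the bot gets picked), so a result only means something over many blinded pairings. What it can't do is reject both subjects or accept both.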

Turing didn't just consider whether a computer could embody complex psycho-social identities (e.g. womanhood, intelligence, self), but first had to give this question some objective, quantifiable meaning, by blinding the experiment and introducing a control group. It's not perfect, but at least it grounds the questions in a concrete framework, and acknowledges that most of the categories in question are only revealed by social dynamics. The only update I would make to it, based on modern developments, would be to pay more attention to the performance of the interrogator, rather than only to the two competing subjects.