by muskmusk 1134 days ago
> I don’t know why but I find this troubling.

You used the word anthropomorphize twice, so I am guessing you don't like building systems whose entire premise rests on anthropomorphization. Sounds like a reasonable gut reaction to me.

I think another way to think of all of this is: LLMs are just pattern matchers and completers. What the training does is slowly etch a pattern into the LLM, which it will then complete when it later sees it in the wild. The pattern can be anything.

If you have a pattern matcher and completer and you want it to perform the role of a configurable chatbot, what kind of patterns would you choose for this? My guess is that the whole system/assistant paradigm was chosen because it is extraordinarily easy for humans to understand. The LLM doesn't care what the pattern is; it will complete whatever pattern you give it.

> And part of what I find troubling is how casually people (prompters and users) are willing to go along with the ‘you are a chatbot’ fiction.

That is precisely why it was chosen :)

1 comments

> you don't like building systems whose entire premise rests on anthropomorphization

I think I don't like people building systems whose entire premise rests on anthropomorphization - while at the same time criticizing anyone who dares to anthropomorphize those systems.

Like, people will say "Of course GPT doesn't have a world model; GPT doesn't have any kind of theory of mind"... but at the same time, the entire system that this chatbot prompting rests on is training a neural net to predict 'what would the next word be if this were the output from a helpful and attentive AI chatbot?'

So I think that's what troubles me - the contradiction between "there's no understanding going on, it's just a simple transformer", and "We have to tell it to be nice otherwise it starts insulting people."

Anthropomorphism is the UI of ChatGPT. Having to construct a framing in which the expected continuation provides value to the user is difficult, and requires technical understanding of the system that a very small number of people have. As an exercise, try getting a "completion" model to generate anything useful.
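To make that exercise concrete, here is a rough sketch of what "constructing a framing" can look like for a plain completion model: you write the start of a document whose most likely continuation happens to be the answer you want. The function name, the translation task, and the example pairs are all made up for illustration, and nothing here guarantees that any particular model will continue the text well.

```python
def frame_as_document(examples, query):
    """Build a text whose natural continuation is the translation of `query`.

    `examples` is a list of (english, french) pairs used as the pattern
    the model is expected to pick up and continue. Hypothetical helper,
    not part of any real API.
    """
    lines = ["English to French translations:", ""]
    for english, french in examples:
        lines.append(f"English: {english}")
        lines.append(f"French: {french}")
        lines.append("")
    # Leave the last entry open so the completion fills in the answer.
    lines.append(f"English: {query}")
    lines.append("French:")
    return "\n".join(lines)


prompt = frame_as_document(
    [("cheese", "fromage"), ("bread", "pain")],
    "apple",
)
print(prompt)
```

Note how much of the work is in guessing which document the model will "recognize"; none of it involves telling the model what to do.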

The value of ChatGPT is to provide a framing that's intuitive to people who are completely unfamiliar with the system. Similar to early Macintosh UI design, it's more important to be immediately intuitive than sophisticated. Talking directly to a person is one immediately intuitive way to convey what's valuable to you, so we end up with a framing that looks like a conversation between two people.

How would we tell one of those people how to behave? Through direction, and when there is only one other person in the conversation our first instinct when addressing them is "you". One intuitive UI on a text prediction engine could look something like:

"An AI chatbot named ChatGPT was having a conversation with a human user. ChatGPT always obeyed the directions $systemPrompt. The user said to ChatGPT $userPrompt, to which ChatGPT replied, "

Assuming this is actually how ChatGPT is configured, I think it's obvious why we can influence its response using "you": this is a conversation between two people, and one of them is expected to be mostly cooperative.
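As a minimal sketch of that framing, assuming the template above really is the whole "UI" (which is a guess, not ChatGPT's known configuration), the wrapper around a text prediction engine could be as small as this. `build_prompt` and `extract_reply` are hypothetical names, and the call to the model itself is left out.

```python
from string import Template

# The framing from the comment above, copied as-is. It is an assumption
# about how such a system might be set up, not a documented prompt.
FRAMING = Template(
    "An AI chatbot named ChatGPT was having a conversation with a human user. "
    "ChatGPT always obeyed the directions $systemPrompt. "
    'The user said to ChatGPT $userPrompt, to which ChatGPT replied, "'
)


def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Fill in the framing; the text ends on an open quote for the model to continue."""
    return FRAMING.substitute(systemPrompt=system_prompt, userPrompt=user_prompt)


def extract_reply(completion: str) -> str:
    """Keep only the text before the first closing quote in the raw continuation."""
    return completion.split('"', 1)[0]


prompt = build_prompt("you are polite and brief", "what is the capital of France?")
print(prompt)
```

The system prompt is just a clause in a story about two people, which is why addressing it as "you" reads naturally.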

(https://twitter.com/ynniv/status/1657450906428866560)