Hacker News
by astrange 1198 days ago
> When Bing/Sydney first lamented its existence it became quite apparent that either LLMs are more capable than we thought or we humans are actually more like statistical token machines than we thought.

Some of the reason it was acting like that is just because MS put emojis in its output.

An LLM has no internal memory or world state; everything it knows is in its text window. Emojis are associated with emotions, so each time it printed an emoji it sent itself further into the land of outputting emotional text. And nobody had trained it to control itself there.
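
To make the feedback loop concrete, here's a toy sketch. Everything in it is made up for illustration: p_emotional and its numbers are my assumptions, not anything known about how Bing actually works. The only point is that the next token depends on what's already in the window, so each emoji raises the odds of the next one.

```python
import random

EMOJI = "🙂"

def p_emotional(window: list) -> float:
    """Toy assumption: each emoji already in the window makes emotional output more likely."""
    return min(1.0, 0.1 + 0.2 * window.count(EMOJI))

def generate(steps: int, seed: int = 0) -> list:
    """Sample tokens one at a time; the only state is the window itself."""
    rng = random.Random(seed)
    window = [EMOJI]  # output starts with one emoji already in the window
    for _ in range(steps):
        # The next token depends only on what is already in the window.
        if rng.random() < p_emotional(window):
            window.append(EMOJI)   # emotional token, which feeds back in
        else:
            window.append("word")  # neutral token
    return window
```

Nothing outside `window` carries over between steps, which is the sense in which the text window is the whole state here.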


You are wrong. It does have encoded memory of what it has seen, encoded as a matrix.

A brain is structurally different, but the mechanisms of memory and recall are comparable, even though the formulation and representation are different.

Why isn't a human just a statistical token machine with memory? I know you experience it as being more profound, but that isn't a reason to think it is.

> You are wrong. It does have encoded memory of what it has seen, encoded as a matrix.

Not after it's done generating. For a chatbot, that's at least every time the user sends a reply back; it rereads the conversation so far and doesn't keep any internal state around.
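
A minimal sketch of what I mean, using a toy stand-in for the model. toy_model and chat_turn are hypothetical names, not any real API; the assumption is only that the model is a pure function of its prompt and the chat wrapper resends the whole transcript every turn.

```python
from typing import List, Tuple

def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: a pure function of its prompt."""
    # Its only "memory" is whatever text appears in the prompt.
    turns = prompt.count("User:")
    return f"I can see {turns} user message(s) in my window."

def chat_turn(history: List[Tuple[str, str]], user_msg: str) -> List[Tuple[str, str]]:
    """Run one turn by rebuilding the whole prompt from the transcript."""
    history = history + [("User", user_msg)]
    prompt = "\n".join(f"{who}: {text}" for who, text in history) + "\nBot:"
    reply = toy_model(prompt)  # nothing is kept inside the model afterwards
    return history + [("Bot", reply)]

history = []
history = chat_turn(history, "hello")
history = chat_turn(history, "do you remember me?")
```

The second reply only "remembers" the first message because the wrapper fed it back in; call toy_model on a fresh prompt and it knows nothing about the earlier turns.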

You could build a model that has internal state on the side, and some people have done that to generate longer texts, but GPT doesn't.

Yes, but for my chat session, as a "one-time clone" that is destroyed when the session ends, it has memory unique to that interaction.

There's nothing stopping OpenAI from using all chat inputs to constantly re-train the network (like a human constantly learns from its inputs).

The limitation is artificial, a bit like many of the arguments here trying to demote what's happening and how pivotal these advances are.

But where is your evidence that the brain and an LLM are the same thing? They are more than simply “structurally different”. I don’t know why people have this need to ChatGPT. This kind of reasoning seems so common on HN; there is this obsession with reducing human intelligence to “statistical token machines”. Do these statistical computations that are equivalent to LLMs happen outside of physics?