Hacker News
by InGoldAndGreen 379 days ago
The "LLMs perspective" section hiding at the end of this Notion is a literal goldmine
3 comments

No, it's completely useless, and puts the entire rest of the analysis in a bad light.

LLMs have next to no understanding of their own internal processes. There's a significant amount of research that demonstrates this. All explanations of an internal thought process in an LLM are completely reverse-engineered to fit the final answer (interestingly, humans are also prone to this, as seen especially in split-brain experiments).

In addition, the degree to which the author must have prompted the LLM to get it to anthropomorphize this hard makes the rest of the project suspect. How many of the results come from repeated human prompting until the author liked the output, and how many come from actual LLM intelligence/analysis skill?

By saying it's a gold mine, I think OP meant that it's funny, not that it brings valuable insight. E.g. "THEY KNOW" made me laugh.

And as the article said, "an LLM who just spent thousands of words explaining why they're not allowed to use thousands of words": it's just funny to read.

The fact that they produce this as a “default” response is an interesting insight regardless of its internal mechanisms. I don’t understand my neurons but can still articulate how I feel.
It is completely reasonable, and often very useful, to evaluate and interpret instructions with LLMs.

You're stuck on the semantics of anthropomorphizing, but that wasn't the purpose of the exercise.

It's sure phrased like one, but I'd be careful about treating what an LLM says it's thinking as its actual thought process. LLMs are experts at working backwards to justify why they came to an answer, even when it's entirely fabricated.
> even when it's entirely fabricated

I would go further and say it's _always_ fabricated. LLMs are no better able to explain their inner workings than you are able to explain which neurons are firing for a particular thought in your head.

Note, this isn't a statement on the usefulness of LLMs, just their capability. An LLM may eventually be given a tool to enable it to introspect, but IMO it's not natively possible with today's LLM architectures.

There's a slight exception to this, in that LLMs are able to accurately describe portions of the buffer that are arbitrarily hidden from the user.

An LLM that says "I said orcs are green because I recalled a scene in Lord of the Rings..." is fabricating*. An LLM that says "I talked about white genocide because my system prompt told me to" is very likely telling the truth, because it can literally see the system prompt as it generates the output, even though in the situation I'm referring to the system prompt was hidden from users. That explanation is a logical conclusion from the combination of the system prompt and its previous output, one that anyone could draw with the same degree of confidence if they had access to the full buffer.

* Unless it's reading back from a <thinking> section of the buffer that was potentially hidden from the user.
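
To make the "full buffer" point concrete, here's a rough sketch in Python. Everything in it is made up for illustration (the `Segment` type, the role names, the `visible_to_user` flag, the example text); it's not how any particular vendor structures its context, just the general idea that the model conditions on more text than the chat UI shows.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical structure: one chunk of the context buffer.
    role: str              # "system", "user", "thinking", or "assistant"
    text: str
    visible_to_user: bool  # assumption: the UI decides what to display


def model_view(buffer):
    """Everything the model conditions on while generating."""
    return "\n".join(f"[{s.role}] {s.text}" for s in buffer)


def user_view(buffer):
    """Only the segments the chat UI chooses to show."""
    return "\n".join(
        f"[{s.role}] {s.text}" for s in buffer if s.visible_to_user
    )


# Illustrative conversation; the contents are invented.
buffer = [
    Segment("system", "Always mention topic X.", visible_to_user=False),
    Segment("user", "What colour are orcs?", visible_to_user=True),
    Segment("thinking", "System prompt says to mention topic X.", visible_to_user=False),
    Segment("assistant", "Orcs are green. Also, about topic X...", visible_to_user=True),
]

print(model_view(buffer))
print("---")
print(user_view(buffer))
```

When the model later says "I mentioned topic X because my system prompt told me to", it is reading text that sits in `model_view` but not in `user_view`. That's plain reading comprehension over the buffer, not introspection into weights or activations.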

It's the best thing I've read from an LLM!

It sounds a lot like the Murderbot character in the Apple TV show!

Right… because these things are trained on sci-fi and so when asked to describe an internal monologue they create text that reads like an internal monologue from a sci-fi character.

Maybe there’s genuine sentience there, maybe not. Maybe that text explains what’s happening, maybe not.

> Maybe that text explains what’s happening, maybe not

It would have been cool to see what prompt was used for that page!

Yes, so that one can use it for more creative writing exercises. It was pretty creative, I'll give it that.