

by TeMPOraL 379 days ago

I'm curious how you ended up in such a conversation in the first place. Hallucinations are one thing, but I can't remember the last time a model claimed that it actually ran something somewhere that wasn't a tool use call, or that it owns a laptop, or such - except when role-playing.

I wonder if the advice on prompting models to role-play isn't backfiring now, especially in a conversational setting. There might even be a difference between "you are an AI assistant that's an expert programmer" vs. "you are an expert programmer" in the prompt, the latter pushing it towards the "role-playing a human" region of the latent space.

(But also yeah, o3. Search access is the key to cutting down on the amount of guessing at answers, and o3 is using it judiciously. It's the only model I use for "chat" when the topic requires any kind of knowledge that's niche or current, because it's the only model I've seen that can reliably figure out when and what to search for, and do it iteratively.)

3 comments

westoncb 379 days ago

I've seen that specific kind of role-playing glitch here and there with the o[X] models from OpenAI. The models do kinda seem to just think of themselves as being developers with their own machines. I think it usually just doesn't come up, but they can easily be tilted into it.


bradly 379 days ago

What is really interesting is that in the "thinking" section it said "I need to reassure the user..." so my intuition is that it thought it was right, but did not think I would believe it was right, and that if it just gave me the confidence, I would try the code and unblock myself. Maybe it thought this gave the best % chance that I would listen to it, and so it was the correct response?


TeMPOraL 379 days ago

Maybe? Depends on what followed that thought process.

I've noticed this a couple of times with o3, too - early on, I'd catch a glimpse of something like "The user is asking X... I should reassure them that Y is correct" or such, which raised an eyebrow because I already knew Y was bullshit and WTF with the whole reassuring business... but then the model would continue actually exploring the question and the final answer showed no trace of Y, or any kind of measurement. I really wish OpenAI gave us the whole thought process verbatim, as I'm kind of curious where those "thoughts" come from and what happens to them.


ben_w 379 days ago

Not saying this to defend the models as your point is fundamentally sound, but IIRC the user-visible "thoughts" are produced by another LLM summarising the real chain-of-thought, so weird inversions of what it's "really" "thinking" may well slip in at the user-facing level — the real CoT often uses completely illegible shorthand of its own, some of which is Chinese even when the prompt is in English, but even the parts in the users' own languages can be hard-to-impossible to interpret.

To agree with your point, even with the real CoT, researchers have shown that models' CoT workspaces don't accurately reflect behaviour: https://www.anthropic.com/research/reasoning-models-dont-say...


andrepd 379 days ago

Okay. And the fact that LLMs routinely make up crap that doesn't exist but sounds plausible, and the fact that this appears to be a fundamental problem with LLMs, this doesn't give you any pause on your hype train? Genuine question, how do you reconcile this?

> I really wish OpenAI gave us the whole thought process verbatim, as I'm kind of curious where those "thoughts" come from and what happens to them.

Don't see what you mean by this; there's no such thing as "thoughts" of an LLM, and if you mean the feature marketers called chain-of-thought, it's yet another instance of LLMs making shit up, so.


TeMPOraL 378 days ago

> And the fact that LLMs routinely make up crap that doesn't exist but sounds plausible, and the fact that this appears to be a fundamental problem with LLMs, this doesn't give you any pause on your hype train? Genuine question, how do you reconcile this?

Simply. Because the same is the case with humans. Mostly for the same reasons.

(Are humans overhyped? Maybe?)

The LLM hype train isn't about them being more accurate or faster than what came before - it comes from them being able to understand what you mean. It's a whole new category of software - programs that can process natural language like humans would; a powerful side effect that took the world by surprise is that making LLMs better at working with natural language implicitly turns them into general-purpose problem solvers.

> Don't see what you mean by this; there's no such thing as "thoughts" of an LLM, and if you mean the feature marketers called chain-of-thought, it's yet another instance of LLMs making shit up, so.

"Chain-of-thought" is so 2024; current models don't need to be told to "think step by step", they're post-trained to first generate a stream of intermediary tokens not meant as the "actual answer", before continuing with the "actual answer". You can call it whatever you like; however, both the research literature and vendors have settled on calling it "thinking" or "reasoning". Treat them as terms of art, if that helps.


bradly 379 days ago

Ehh... I did ask it if it would be able to figure this out or if I should try another model :|


agos 379 days ago

A friend recently had a similar interaction where ChatGPT told them that it had just sent them an email or a WeTransfer link with the requested file.
