HN Mirror

by HarHarVeryFunny 491 days ago

My mistake, actually. I was trying to recall "Change my mind!", and ended up with that instead. It was meant as a tongue-in-cheek challenge - I'd be more than happy to hear evidence of why I'm wrong and that there's a more abstract latent space being used in these models, not just something more akin to an elaborated parse tree.

To clarify:

1) I'm guessing that there really isn't a highly abstract latent space being represented by transformer embeddings, and it's really more along the lines of the input token embeddings just getting iteratively augmented/tagged ("transformed") with additional grammatical and semantic information as they pass through each layer. I'm aware that there are some superposed representations, per Anthropic's interpretability research, but it seems this doesn't need to be anything more than being tagged with multiple alternate semantic/predictive pattern identifiers.

2) I'd reserve the label "thinking" for what's being called reasoning/planning in these models, which I'd characterize as multi-step what-if prediction, with verification and backtracking where needed. Effectively a tree search of sorts (different branches of reasoning being explored), even if implemented in o1/R1 "linear" fashion. I agree that this is effectively close to what we're doing too, except of course we're a lot more capable and can explore and learn things during the reasoning process if we reach an impasse.
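To make point 2 concrete, here is a minimal sketch of "what-if prediction with verification and backtracking" as a depth-limited tree search. Everything in it is an illustrative assumption: the toy arithmetic puzzle, and the names `search`, `propose`, `is_valid` and `is_goal`. It is not a description of how o1 or R1 actually work. The `trace` list records every state in the order it was visited, which is roughly what a tree search looks like once flattened into a single "linear" sequence.

```python
from typing import Callable, Iterable, List, Optional


def search(state: int,
           propose: Callable[[int], Iterable[int]],
           is_valid: Callable[[int], bool],
           is_goal: Callable[[int], bool],
           depth: int,
           trace: List[int]) -> Optional[List[int]]:
    """Depth-first search that backtracks when a branch fails verification."""
    trace.append(state)  # linear log of every state visited, in order
    if is_goal(state):
        return [state]
    if depth == 0:
        return None  # step budget exhausted on this branch
    for candidate in propose(state):      # "what-if" next steps
        if not is_valid(candidate):       # verification: prune bad steps
            continue
        path = search(candidate, propose, is_valid, is_goal, depth - 1, trace)
        if path is not None:
            return [state] + path
    return None  # every option failed: the caller backtracks


# Toy puzzle (hypothetical): get from 1 to 10 using "double" or "add 3",
# never exceeding the target.
TARGET = 10
trace: List[int] = []
path = search(
    1,
    propose=lambda s: [s * 2, s + 3],
    is_valid=lambda s: s <= TARGET,
    is_goal=lambda s: s == TARGET,
    depth=4,
    trace=trace,
)
print(path)   # [1, 2, 4, 7, 10]
print(trace)  # [1, 2, 4, 8, 7, 10]
```

The 8 in the trace is the dead end: both of its continuations (16 and 11) fail verification, so the search backs up to 4 and tries 7 instead. The final path omits the dead end, while the linear trace still contains it.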

eightysixfour 491 days ago

I am not sure how someone would change your mind beyond Anthropic's excellent interpretability research. It shows clearly that there are features in the model which reflect entities and concepts, across different modalities and languages, which are geometrically near each other. That's about as latent space-y as it gets.

So I'll ask you, what evidence could convince you otherwise?
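For anyone unsure what "geometrically near" means operationally, it is usually measured with cosine similarity between feature directions. A tiny sketch follows; the three vectors are made up purely for illustration (they are not taken from any model or paper), and the names `dog_en`, `dog_fr` and `tax_law` are hypothetical labels.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Invented 3-d vectors standing in for learned feature directions.
dog_en = np.array([0.9, 0.1, 0.0])   # hypothetical: "dog" seen in English text
dog_fr = np.array([0.8, 0.2, 0.1])   # hypothetical: "chien" seen in French text
tax_law = np.array([0.0, 0.1, 0.9])  # hypothetical: an unrelated concept

near = cosine(dog_en, dog_fr)
far = cosine(dog_en, tax_law)
print(near > far)  # True for these invented vectors
```

The claim under discussion is that real learned features behave like the first pair: the same concept lands in nearby directions regardless of language or modality.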

HarHarVeryFunny 491 days ago

Good question - I guess if the interpretability folk went looking for this sort of additive/accumulative representation and couldn't find it, that'd be fairly conclusive.

These models are obviously forming their own embedding-space representations for the things they are learning about grammar and semantics, and it seems that latent space-y representations are going to work best for that, since swapping in a closely related thing is not going to change the meaning of a sentence as much as swapping in something less closely related.

But ... that's not to say that each embedding as a whole is not accumulative - it's just suggesting they could be accumulations of latent space-y things (latent sub-spaces). It's a bit odd if Anthropic haven't directly addressed this, but if they have I'm not aware of it.
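For what it's worth, the additive bookkeeping itself is easy to picture: in a standard transformer each block's output is added to a running per-token vector through a residual connection, so the final vector is literally the input embedding plus the sum of every layer's contribution. A toy sketch of just that arithmetic, where `toy_layer` is a hypothetical stand-in for attention + MLP and the weights are random (so it says nothing about what the added components mean, or whether they live in separate sub-spaces):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_layers = 8, 4


def toy_layer(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a transformer block: reads x, returns an update."""
    return np.tanh(x @ w)


def forward(x0: np.ndarray, weights: list) -> tuple:
    """Run the residual stream, keeping each layer's additive update."""
    x = x0.copy()
    updates = []
    for w in weights:
        delta = toy_layer(x, w)  # computed from everything accumulated so far
        updates.append(delta)
        x = x + delta            # residual connection: add, don't overwrite
    return x, updates


x0 = rng.normal(size=d_model)  # stand-in for an input token embedding
weights = [rng.normal(scale=0.3, size=(d_model, d_model)) for _ in range(n_layers)]
final, updates = forward(x0, weights)

# The final vector decomposes exactly into the input plus per-layer additions.
print(np.allclose(final, x0 + sum(updates)))  # True
```

The decomposition holds by construction, so it can't settle the argument on its own. The open question is whether the individual additions are interpretable "tags" or something more abstract.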