by HarHarVeryFunny
491 days ago
My mistake, actually. I was trying to recall "Change my mind!", and ended up with that instead. It was meant as a tongue-in-cheek challenge - I'd be more than happy to hear evidence that I'm wrong and that there's a more abstract latent space being used in these models, not just something more akin to an elaborated parse tree.

To clarify:

1) I'm guessing that there really isn't a highly abstract latent space being represented by transformer embeddings, and that it's really more along the lines of the input token embeddings just getting iteratively augmented/tagged ("transformed") with additional grammatical and semantic information as they pass through each layer. I'm aware that there are some superposed representations, per Anthropic's interpretability research, but it seems this doesn't need to be anything more than the embeddings being tagged with multiple alternate semantic/predictive pattern identifiers.

2) I'd reserve the label "thinking" for what's being called reasoning/planning in these models, which I'd characterize as multi-step what-if prediction, with verification and backtracking where needed. Effectively a tree search of sorts (different branches of reasoning being explored), even if implemented in O1/R1 "linear" fashion. I agree that this is effectively close to what we're doing too, except of course we're a lot more capable and can explore and learn things during the reasoning process if we reach an impasse.
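To make 2) a bit more concrete, here's a toy sketch of what I mean by a tree search that comes out as a "linear" trace. This is purely my own illustration and not a claim about how O1/R1 actually work internally: the number puzzle, the `reason` function, and the `STEPS` list are all made up for the example. It tries a step ("what if"), checks the result, and backtracks on a dead end, while everything it does gets appended to one flat list.

```python
# Toy illustration only: depth-first "what-if" search with verification and
# backtracking, logged as a single linear trace. All names are hypothetical.

def reason(state, target, steps, trace, depth=0, max_depth=6):
    """Return a list of states from `state` to `target`, or None on failure."""
    trace.append(f"consider {state}")
    if state == target:
        # Verification succeeded: this branch reaches the goal.
        trace.append("verified: goal reached")
        return [state]
    if state > target or depth == max_depth:
        # Verification failed: overshoot or too deep, so abandon this branch.
        trace.append(f"dead end at {state}, backtrack")
        return None
    for name, step in steps:
        # Explore one hypothetical continuation.
        trace.append(f"what if: {name} from {state}")
        result = reason(step(state), target, steps, trace, depth + 1, max_depth)
        if result is not None:
            return [state] + result
    trace.append(f"all options from {state} failed, backtrack")
    return None


STEPS = [("double", lambda x: x * 2), ("add 3", lambda x: x + 3)]

trace = []
path = reason(1, 11, STEPS, trace)

print(path)
for line in trace:
    print(line)
```

The search itself is a tree (each state has two possible continuations), but `trace` is just a flat sequence of "consider", "what if" and "backtrack" lines, which is the sense in which I'd call the O1/R1 style a linearized tree search.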
So I'll ask you, what evidence could convince you otherwise?