by XenophileJKO
62 days ago
Don't or can't. My assumption is the model no longer actually thinks in tokens, but in internal tensors. This is advantageous because it doesn't have to collapse the decision to a single token and can simultaneously propagate many concepts per context position.
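
To make the "collapse" point concrete, here is a toy sketch, not how any real model works. The vocabulary, the 2-d embeddings, and the function names are all made up for illustration. It contrasts committing to one token (only that token's embedding is fed forward) with feeding forward a probability-weighted mix of embeddings, which keeps the runner-up concept alive.

```python
import math

# Hypothetical 3-token vocabulary with made-up 2-d embeddings.
EMBED = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "car": [-1.0, 0.0]}

def softmax(logits):
    """Turn a dict of logits into a dict of probabilities."""
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

def collapse_to_token(probs):
    """Token-space step: pick the most likely token, keep only its embedding."""
    tok = max(probs, key=probs.get)
    return tok, EMBED[tok]

def latent_mixture(probs):
    """Latent-style step (assumed): probability-weighted mix of all embeddings."""
    dim = len(next(iter(EMBED.values())))
    return [sum(probs[t] * EMBED[t][i] for t in EMBED) for i in range(dim)]

# "cat" and "dog" are nearly tied; "car" is very unlikely.
logits = {"cat": 2.0, "dog": 1.9, "car": -3.0}
probs = softmax(logits)

token, hard_vec = collapse_to_token(probs)  # the "dog" signal is discarded
soft_vec = latent_mixture(probs)            # the "dog" signal is retained

print(token, hard_vec)
print([round(x, 3) for x in soft_vec])
```

In the collapsed path the second coordinate (the "dog" direction) is exactly zero after the choice; in the mixture it stays close to the probability assigned to "dog". That is the sense in which a continuous state can carry several candidates per position.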
Separately, I think Anthropic are probably the least likely of the big 3 to release a model that uses latent-space reasoning, because it's a clear step down in the ability to audit CoT. There has even been some discussion that they accidentally "exposed" the Mythos CoT to RL [0] - I don't see how you would apply a reward function to latent-space reasoning, where there are no readable tokens to score.
[0]: https://www.lesswrong.com/posts/K8FxfK9GmJfiAhgcT/anthropic-...