by Klathmon
532 days ago
So is the big improvement here simply skipping the unembedding/embedding step for internal thoughts? Or is it mainly in the training methods that teach the CoT and how to switch between "latent thought" and text output? It's really interesting that a fixed number of "latent thoughts" performed as well as a binary classifier! I didn't expect that at all. The way OpenAI talks about CoT, it seems the ability to let the model "keep thinking" lets them continually score higher on benchmarks while throwing eye-watering amounts of compute at inference.