by meroes
404 days ago
I might just be on the opposite side of the aisle, but to me chain-of-thought is better understood as simply more context. Of course there is ambiguity: more context would be hard to distinguish from core reasoning, and vice versa. I think LLMs/AI mean we can substitute reasoning with vast accumulations of, and relations between, contexts. Remember, RLHF gives the models some, and perhaps most, of these chains of thought when there isn’t sufficient text to scrape for each family of problems. When I see that chain-of-thought, the first thing I think of is my peers who had to write, rewrite, nudge, and correct these chains of thought, not core reasoning. The CoT has that same overexplained step-by-step style so many RLHF’ers will be accustomed to, and much of it was authored or originated by them. And given the infinite holes it feels like it is plugging, I don’t call that RL reasoning.