

by meroes 404 days ago

I might just be on the opposite side of the aisle, but to me chain-of-thought is better understood as simply more context.

Of course there is ambiguity, though: more context would be hard to distinguish from core reasoning, and vice versa.

I think LLMs/AI mean we can substitute reasoning with vast accumulations of, and relations between, contexts.

Remember, RLHF gives the models some, and perhaps most, of these chains-of-thought when there isn’t sufficient text to scrape for each family of problems. When I see that chain-of-thought, the first thing I think of is my peers who had to write, rewrite, nudge, and correct these chains of thought, not core reasoning.

The CoT has that same overexplained step-by-step style so many RLHF’ers will be accustomed to, and much of it was authored/originated by them. And because it feels like plugging an infinite number of holes, I don’t call that RL reasoning.