This has been a problem for AI research since its start in the 70s. AI researchers come up with names for things they assume happen in the human brain, and then further assume that if they write a computer program that does something that could be given the same name, it must work too.
So-called chain of thought can improve output quality ("reasoning ability") to an extent, but I wish, for gosh sakes, it were called "intermediate token conditioning" or something else explanatory, or at least descriptive.
https://twitter.com/Meaningness/status/1639120720088408065