by jxmorris12
486 days ago
> In the previous paragraph, the author makes the case for why LeCun was wrong with the example of reasoning models. Yet, in the next paragraph, this assertion is made which is just a paraphrasing of LeCun's original assertion. Which the author himself says is wrong.

This is a subtle point that may not have come across clearly enough in my original writing. A lot of folks were saying that the DeepSeek finding that longer chains of thought can produce higher-quality outputs contradicts Yann's thesis overall, but I don't think it does. It's true that models like R1 can correct small mistakes. But in the limit of tokens generated, the chance that they generate the correct answer still decays to zero.
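To make the "decays to zero" claim concrete, here is a minimal sketch of the simplest version of the argument. It assumes every token independently has some fixed chance of being an unrecoverable error, which is a strong simplification and not a description of how R1 or any real model behaves. The function name `p_correct` and the 0.1% error rate are made up for illustration.

```python
def p_correct(per_token_error: float, num_tokens: int) -> float:
    """Chance that none of num_tokens tokens is an unrecoverable error.

    Assumes errors are independent with a fixed per-token rate
    (a hypothetical simplification, not a measured property of any model).
    """
    return (1.0 - per_token_error) ** num_tokens


if __name__ == "__main__":
    # 0.001 is an illustrative error rate, not an empirical number.
    for n in (10, 100, 1_000, 10_000):
        print(n, p_correct(0.001, n))
```

Under this toy model, self-correction only lowers the effective per-token error rate. As long as that rate stays above zero, the product still shrinks toward zero as the number of tokens grows.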
There was a paper not too long ago showing that reasoning models will keep increasing their response length more or less indefinitely while working on a problem, but the return from doing so asymptotes toward zero. My apologies for not having a link.