by TeMPOraL 751 days ago
> The problem is that these models weren't trained to reason.

Except they kind of were. Specifically, they were trained to predict the next token from text input, with the optimization target effectively being: does the result make sense to a human? That's embedded in the training data: it's not random strings, it's the output of human reasoning, both basic and sophisticated. That's also what RLHF selects for later on. The models are indeed forced to simulate reasoning.

> don't train it to do something else and then expect it to do the thing you didn't train it for.

That's the difference between AGI and specialized AI: AGI is supposed to do the things you didn't train it to do.

1 comment

I think people don’t recognize that it’s currently doing single-turn reasoning, and demonstrating the building blocks of real-time reasoning with continuous input.

If we tested humans on first-thought answers, given in 5 seconds or less, on half the problems we test LLMs on, we might prove humans can’t reason as well.