Hacker News
by Jensson 947 days ago
A big difference between a game like Go and writing text is that text is single player. I can write out the entire text, look at it and see where I made mistakes on the whole and edit those. I can't go back in a game of Go and change one of my moves that turned out to be a mistake.

So trying to make an AI that solves the entire problem before writing the first letter will likely not result in a good solution while also making it compute way too much since it solves the entire problem for every token generated. That is the kind of AI we know how to train so for now that is what we have to live with, but it isn't the kind of AI that would be efficient or smart.

1 comments

This doesn't seem like a major difference, since LLMs are also choosing the most likely token from a probability distribution over tokens, which is why they respond one token at a time. They can't "write out" the entire text at once, which is why fascinating methods like "think step by step" work at all.
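To make the token-at-a-time point concrete, here is a minimal sketch of that generation loop. The `TOY_MODEL` table is a made-up stand-in that only looks at the last token, not a real LLM, and all names in it are assumptions for illustration.

```python
import random

# Hypothetical toy "model": maps the last token to a probability
# distribution over next tokens. A real LLM conditions on the whole context.
TOY_MODEL = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"answer": 0.7, "number": 0.3},
    "a": {"number": 1.0},
    "answer": {"is": 1.0},
    "number": {"is": 1.0},
    "is": {"prime": 0.5, "42": 0.3, "<end>": 0.2},
    "prime": {"<end>": 1.0},
    "42": {"<end>": 1.0},
}


def next_token(dist, rng, greedy=False):
    """Pick one token: the most likely one if greedy, else a weighted sample."""
    if greedy:
        return max(dist, key=dist.get)
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(model, max_tokens=10, seed=0, greedy=False):
    """Emit tokens one at a time; earlier tokens are never revised."""
    rng = random.Random(seed)
    out = ["<start>"]
    for _ in range(max_tokens):
        tok = next_token(model[out[-1]], rng, greedy)
        if tok == "<end>":
            break
        out.append(tok)
    return out[1:]


print(" ".join(generate(TOY_MODEL, greedy=True)))
```

Note that the loop only ever appends: once a token is in `out`, nothing goes back and changes it.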
But it can't improve its answer after it has written it, and that is a major limitation. When a human writes an article or response or solution, the result is likely not the first thing the human thought of. Instead they write something down and work on it until it is tight and neat and communicates just what they want to communicate.

Such answers will be very hard for an LLM to find. Instead you mostly get very verbose messages, since that is how current LLMs think.

Completely agree. The System 1/System 2 distinction seems relevant here. As powerful as transformers are with just next-token generation and context, which can be hacked to form a sort of short-term memory, some type of real-time learning + long-term memory storage seems like an important research direction.
> But it can't improve its answer after it has written it, that is a major limitation.

It can be instructed to study its previous answer and find ways to improve it, or to make it more concise, etc., and that is working today. That can easily be automated by having LLMs talk to each other.
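A rough sketch of what automating that loop could look like. The `revise` callable is where a model call would go; `toy_revise` below is a hypothetical stand-in that just drops filler words, so the example runs without any LLM or API.

```python
from typing import Callable

# Hypothetical filler words the stand-in reviser removes.
FILLER = ("basically", "very", "really")


def toy_revise(text: str) -> str:
    """Stand-in for a model call: drops the first filler word it finds."""
    words = text.split()
    for i, word in enumerate(words):
        if word.lower() in FILLER:
            return " ".join(words[:i] + words[i + 1:])
    return text


def refine(draft: str, revise: Callable[[str], str], max_rounds: int = 3) -> str:
    """Feed the draft back to the reviser until it stops changing."""
    for _ in range(max_rounds):
        improved = revise(draft)
        if improved == draft:  # nothing left to improve, stop early
            break
        draft = improved
    return draft


print(refine("this is basically a very really tight answer", toy_revise, max_rounds=5))
```

The `max_rounds` cap is a design choice of this sketch: a reviser that keeps rewording its own output would otherwise never terminate.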

That is true and it isn't. GPT-4 has been shown to say, halfway through an answer, "wait, that's not correct, I'm sorry, let me fix that" and then correct itself. For example, it stated that a number was prime and explained why, and when showing the steps it found the number was divisible by 3 and said "oh, I made a mistake, it actually isn't prime."
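For reference, the check that catches that kind of mistake is tiny. The comment doesn't say which number was involved, so 57 below is only an assumed example; the digit-sum rule for 3 and trial division are standard.

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(abs(n)))


def divisible_by_3(n: int) -> bool:
    """A number is divisible by 3 exactly when its digit sum is."""
    return digit_sum(n) % 3 == 0


def is_prime(n: int) -> bool:
    """Trial division up to the square root of n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


n = 57  # assumed example, not the number from the GPT-4 exchange
print(n, "digit sum:", digit_sum(n), "divisible by 3:", divisible_by_3(n), "prime:", is_prime(n))
```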