Ask HN: Can GPT improve itself à la AlphaGo Zero?


2 points by vermorel 1223 days ago

Yann LeCun is making the case that generative models are fundamentally divergent: at every token there is some probability of getting something wrong, and these errors compound, so the chance that the whole output stays correct shrinks exponentially with the number of generated tokens.
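A back-of-the-envelope version of that argument, assuming (as a simplification, not something stated above) that each token is independently wrong with the same small probability \epsilon:

```latex
P(\text{all } n \text{ tokens correct}) = (1 - \epsilon)^n \approx e^{-\epsilon n}
% Illustrative numbers only: with \epsilon = 0.01,
%   n = 100  gives  0.99^{100} \approx 0.37
%   n = 500  gives  0.99^{500} \approx 0.007
```

Real token errors are neither independent nor equally likely, so this is only a sketch of why long generations are the worrying case.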

I tend to agree with the premise. However, what if the generative process is overlaid with an "inner debate", as a substitute for having the model play against itself, à la AlphaGo Zero?

The sequence of prompts would go:

1. Please explain X

2. Criticize your explanation for X, use reason and logic.

3. Based on your own critique, improve your explanation of X.
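The three prompts above can be chained in a small loop. This is only a minimal sketch: `ask` is a hypothetical callable standing in for whatever LLM API you use (prompt string in, reply string out), and `self_refine` and `echo_model` are names made up for illustration. The prompts are the short versions from the list, with the previous draft and critique pasted in so each call is self-contained.

```python
from typing import Callable, List


def self_refine(ask: Callable[[str], str], topic: str, rounds: int = 1) -> List[str]:
    """Run explain -> criticize -> improve and return every draft, oldest first.

    `ask` is a hypothetical stand-in for an LLM call: prompt in, reply out.
    """
    # Step 1: initial explanation.
    drafts = [ask(f"Please explain {topic}")]
    for _ in range(rounds):
        # Step 2: ask the model to criticize its latest draft.
        critique = ask(
            f"Criticize your explanation for {topic}, use reason and logic.\n\n"
            f"Explanation:\n{drafts[-1]}"
        )
        # Step 3: ask for an improved draft given that critique.
        improved = ask(
            f"Based on your own critique, improve your explanation of {topic}.\n\n"
            f"Explanation:\n{drafts[-1]}\n\nCritique:\n{critique}"
        )
        drafts.append(improved)
    return drafts


def echo_model(prompt: str) -> str:
    """Dummy model for trying the loop offline: echoes the prompt's first line."""
    return "[reply to] " + prompt.splitlines()[0]


if __name__ == "__main__":
    for i, draft in enumerate(self_refine(echo_model, "entropy", rounds=2)):
        print(i, draft)
```

Keeping every draft, rather than only the last one, is deliberate: nothing in the loop guarantees that a later draft is better, so you need the earlier ones around to compare against.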

I have manually toyed with this approach (the prompts are longer, but you get the gist), and it gives very interesting results. This could let GPT re-create, on its own, a better, higher-quality corpus to learn from.

Is anybody pursuing this approach for LLMs?

1 comments

senko 1223 days ago

The thing with AlphaGo Zero is that there is a clear external arbiter of which side of the internal debate wins, so the algorithm can learn.

For an LLM to use this technique on the kind of reasoning you're talking about, you need a human in the loop to explain to it why it's wrong or right; otherwise it just hallucinates random stuff.

That's basically what RLHF[0] is, which was used to great success in training ChatGPT.

[0] https://huggingface.co/blog/rlhf


vermorel 1223 days ago

Thanks! The interesting thing is that my casual observations suggest GPT itself might already be good enough to act as its own arbiter, just as a human writer can improve their own writing by iterating over it. In a sense, having humans in the loop may have been what it took to get the model to the point where self-arbitration becomes possible.
