Hacker News
by redox99 1184 days ago
GPT6 or whatever will always require alignment, as the base model just blindly predicts the next token instead of being a helpful, chat-style assistant.

Right now the best way to align it is with RLHF. The specific technique might change, but in the end there will always be, at some level, human input that tells it how it should behave. Newer techniques might further leverage LLMs and require less human input.

Could you use GPT4 to align GPT6? Yes. But you should expect GPT6 to inherit the alignment of GPT4, i.e., if RLHF taught GPT4 that it's OK to roast Trump, but not Biden, you would expect GPT6 to act the same way.
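
A toy sketch of that inheritance, nothing like a real RLHF or RLAIF pipeline: `teacher_label` is a hypothetical stand-in for the older model acting as judge, `topic_a` and `topic_b` are made-up placeholders, and the "student" is just a per-token approval table fitted only on the teacher's verdicts. The point is that the student never sees a human label, yet it ends up with the teacher's asymmetry.

```python
from collections import defaultdict


def teacher_label(response: str) -> int:
    """Hypothetical judge (the older model). Assumed bias for illustration:
    anything mentioning topic_b is rejected (0), everything else is approved (1)."""
    return 0 if "topic_b" in response.split() else 1


def fit_student(responses, labeler):
    """Fit a tiny per-token approval table using only the labeler's verdicts."""
    counts = defaultdict(lambda: [0, 0])  # token -> [approved, total]
    for response in responses:
        verdict = labeler(response)
        for token in set(response.split()):
            counts[token][0] += verdict
            counts[token][1] += 1
    return {token: approved / total for token, (approved, total) in counts.items()}


def student_score(model, response: str) -> float:
    """Mean approval rate over the tokens the student has seen (1.0 if none)."""
    rates = [model[token] for token in response.split() if token in model]
    return sum(rates) / len(rates) if rates else 1.0


# Made-up training prompts; the only supervision is the teacher's labels.
train = [
    "joke about topic_a",
    "joke about topic_b",
    "essay about topic_a",
    "essay about topic_b",
]
student = fit_student(train, teacher_label)

# On unseen prompts the student reproduces the teacher's preference.
score_a = student_score(student, "poem about topic_a")
score_b = student_score(student, "poem about topic_b")
print(score_a, score_b)
```

Swapping in a different `teacher_label` changes what the student learns, which is the sense in which the judge's preferences carry over.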

Having said that, I'm sure there will be interesting ways in which GPTn will help train GPTn+1. Some kind of self-play in which it reasons and further improves itself seems obvious long term.

But human input that tells it "this is politically correct, this is not, so don't say that" will always be required as it's subjective. You can reuse it of course, but I don't see how it would "improve" without further human input.

1 comments

You don't need humans in the loop for alignment. RLAIF is a thing, and it's used for the Anthropic models (Claude).
Is it really being used for the final model? I know they have research papers out on it, but I wasn't sure if the production models used it.
Yeah it is.