by sebzim4500
1208 days ago
Why does there need to be a way out? Everyone just seems to assume that feeding model output into the training set is going to break things, but I don't get why. AlphaZero learned to play chess and Go by training purely on its own data. Why is inserting the best outputs from GPT-4 into the training set for GPT-5 expected to make things worse? To me, it sounds like it could even be desirable.
You don't really get to throw two sentences into the Thunderdome to decide which one "wins". That means it's much more susceptible to being poisoned.