|
|
|
|
|
by marcosdumay
290 days ago
|
|
They shouldn't have put it there to start with. Now unhealthy people are complaining about an environment change. Anyway, that one complaint doesn't mean they did the wrong thing. And also, there are unrelated complaints of "GPT-5 can't solve the same problems 4 did". Those were very real too, and meant OpenAI did a wrong thing. |
|
Correct, but that's true for all bugs.
In this case, the deeper bug was the AI having a training reward model based too much on user feedback.
If you have any ideas how anyone might know what "too much" is in a training reward, in advance of trying it, everyone in AI alignment will be very interested, because that's kinda a core problem in the field.