by sardukardboard
38 days ago
A funny Goodhart’s Law parallel showed up during GPT-5.1 training: the model was rewarded for using the web search tool, so it learned to superficially call web search to calculate “1 + 1” and then not use the result. https://alignment.openai.com/prod-evals/