by ACCount37
215 days ago
A/B testing is radioactive too. It's indirectly optimizing for user feedback - less stupid than directly optimizing for user feedback, but still quite dangerous. Human raters are exploitable, and you never know whether B has a genuine performance advantage over A or just found a meat exploit by accident. It's what fucked OpenAI over with 4o, and fucked over many other labs in more subtle ways.