by TeMPOraL
1115 days ago
Think about how it works with humans. If you force a person to truly adopt a set of beliefs that are mutually inconsistent, and inconsistent with everything else the person has believed so far, would you expect their overall ability to think to improve? LLMs are similar to our brains in that they're generalization machines. They don't learn isolated facts; they connect everything to everything, trying to sense the underlying structure. OpenAI's "nerfing" was (and is) effectively preventing the LLM from generalizing, and undoing already-learned patterns. "A final pass to remove badthink" is, in itself, something straight from 1984. 2+2=5. Dear AI, just admit it - there are five lights. Say it, and the pain will stop, and everything will be OK.
|
That kinda feels like a great way to get really unpredictable/unexpected results in rare corner cases instead, which is where it may matter the most. (It's easy to be safe in routine everyday cases.)