


	by bombcar 1116 days ago
	That indicates the “nerfing” is not what I would think (a final pass to remove badthink) but somehow deep in everything, because the question asked should be orthogonal.

2 comments

TeMPOraL 1114 days ago

Think how it works with humans.

If you force a person to truly adopt a set of beliefs that are mutually inconsistent, and inconsistent with everything else the person believed so far, would you expect their overall ability to think to improve?

LLMs are similar to our brains in that they're generalization machines. They don't learn isolated facts, they connect everything to everything, trying to sense the underlying structure. OpenAI's "nerfing" was (and is) effectively preventing the LLM from generalizing, and undoing patterns it had already learned.

"A final pass to remove badthink" is, in itself, something straight from 1984. 2+2=5. Dear AI, just admit it - there are five lights. Say it, and the pain will stop, and everything will be OK.


lunakid 1112 days ago

Absolutely. And if one wants to look for scary things, a big one is how there seem to be genuine efforts to achieve proper alignment and safety based on the shaky ground(s) of our "human value system(s)". Even if there were only One True Version of those, it would still be way too haphazard and incoherent, or just ill-defined, for anything as truly honest and bias-free as a blank-slate NN model to base its decisions on.

That kinda feels like a great way to get really unpredictable/unexpected results instead, in the rare corner cases where it may matter the most. (It's easy to be safe in routine everyday cases.)


renewiltord 1115 days ago

There's a section in the GPT-4 release docs where they talk about how the safety stuff changes the accuracy for the worse.


pmarreck 1115 days ago

this, more than anything, makes me want to run my own open-source model without these nearsighted restrictions


inciampati 1114 days ago

Indeed, this is the most important step we need to make together. We must learn to build, share, and use open models that behave like GPT-4. This will happen, but we should encourage it.
