Hacker News
by cs702 1206 days ago
> my takeaway is that any LLM that can behave "good" must also be able to behave "badly"; philosophically, because it's not possible to encode "good" without somehow "accidentally" but unavoidably also encoding "bad/evil".

That's a really good non-technical summary of the OP's hypothesis. Thanks!