| HN Mirror

	by WhiskeyChicken 940 days ago
	Is there a specific reason we should expect that "instructing not to perform" an illegal activity should result in it adhering to said instruction? Is this any different than when it provides wrong output about other things, even when the operator attempts to "engineer" the prompt to guide the result?

1 comments

nerdponx 940 days ago

I'd be curious what would happen if RLHF were used to try to penalize illegal/immoral/unethical activity.

I had always dismissed Asimov's "rules of robotics" as silly: nobody would ever design a mission-critical robot with indeterminate stochastic behavior! Maybe I should reconsider and re-read those stories, because people seem very eager to do just that.

staunton 940 days ago

People will most definitely build such things (including autonomous swarms of killer robots used by the military; projects are ongoing...). However, Asimov's stories illustrate how difficult it is to find such rules. They are certainly not meant as inspiration for how to actually program robots...
