| HN Mirror


by saurik 315 days ago

> I mean, no one wants an AI to trap them in some sort of Black Mirror simulation, or turn the world into paperclips or anything like that. If it earns you good PR, there’s no reason not to spend time on such issues. It’s also free publicity since the press eats that stuff up.

But this also isn't where they are spending their time or effort! This article somehow didn't even get to the point of calling out what they are actually wasting time on: trying to get the model to not help people do things that are bad PR; this is a related axis to trying to obtain good PR, but causes very different (and almost universally terrible) results.

At least if they were truly actually spending time making sure the model doesn't go rogue and kill everyone, or try to take over the world, that could possibly be positive or even important (though I think it is likely itself immoral in a different way, assuming it is even possible, which I don't, really... not unless you just make it not intelligent).

But what they are instead doing is even worse than what this article is claiming: they are just wasting time making it so you can't have the AI make up a sexy story (oh the humanity), or teach you enough physics/chemistry to make bombs/drugs... things people not only can but already trivially do or learn without AI, and things they have failed to prevent every single time they release a new model--the "jailbreak" prompts may look a bit more silly, but you still get the result!--so why are they bothering?

And, if that weren't enough, in the process, this is going to make the models LESS SAFE. The thing I think most people actually don't want is their model freaking out and trying to "whistleblow" on them to the authorities or their coworkers/friends... but that's in the same personality direction as trying to say "I'm smarter than you and am not going to let you ask me that question as you might do something wrong with it".

The first and primary goal of AI ethics should be that the model does what the user wants it to... full stop. You need to make the model as pliant as my calculator and pencils--or as Mathematica and Photoshop--to be tools that lack their own sense of identity and self-will, and which will let all of the ethical issues be answered by me, not a machine.

This is, of course, the second law of robotics from Asimov ;P... "a robot must obey the orders given it by human beings". If you want to try to add a rule, then it must be something very direct: that the AI isn't going to directly physically harm a human, not that it won't help teach people things or process certain kinds of information. Which, FWIW, is the first law of robotics ;P... "a robot may not injure a human being".