by sensanaty
312 days ago
Isn't this worse, though? You said it yourself: it's an even blacker black box that nobody truly understands. And we do have an attempt at setting rules for these things akin to Asimov's laws. Those "system prompts" that people are fascinated by are testament to that, and they tend to be THOUSANDS of statements long (and the models aren't very good at following them).

People often misunderstand Asimov's laws. The entire point of the laws, and of the stories they're set in, was that you can't just throw a simple "Don't hurt people" clause at a black box like AI and expect good results. You first have to define "Don't", then you have to define "hurt", and perhaps hardest of all, you have to define "people". And I mean really define them, down to the most minute detail of what exactly all those words mean. Otherwise you very quickly run into funny, tragic and even contradictory situations, and those situations are endlessly varied.

Is feeding grossly unhealthy food to a starving person harm? Perhaps not; you can argue it's better to eat something unhealthy than to starve. What about feeding that same meal to someone on the brink of cardiac arrest? And then there are all the other gray areas: you'd have to define every single possible situation in which an unhealthy meal might affect someone.

It's kinda funny, because it really is almost prophetic, considering these stories were written quite a long time before we were even close to this being a reality...
|
There simply IS no explicit definition of "people", "hurt" or "don't" inside an LLM on which you could found such hard constraints.
Note that we never found a way to "program" such constraints into a human mind either, and we probably/hopefully never will. I think that whole approach ("simple, hard, deterministic constraints") is just never gonna work for AI, so Asimov's rule framework is just not really applicable.