Hacker News
by swagempire 985 days ago
Which protections? There are no protections currently and you are then imagining there could be effective ones?

We have no capacity to allow machines to judge malicious, moral or ethical behavior within the context of an LLM. So I'm not sure how we could implement them.

To implement anything remotely Asimovian, we would need AI that can reason and reflect deeply about its potential behaviors and their likely consequences.

This seems very far off still...

1 comments

OpenAI has done this with their LLMs, most serious players have.

See: https://cdn.openai.com/papers/gpt-4-system-card.pdf

They cover the safety/ethics built into GPT-4.

They’re making a token effort, but this kind of thing doesn’t extend to something more intelligent that can cause real harm. If you scaled GPT-4 up to something much more intelligent, it would probably at best just try to please us with ethical-sounding responses that aren’t necessarily good decisions. I remember seeing something where it said that saying an offensive word that no one will hear isn’t acceptable, even if it’s the only way to save millions of people.
I wouldn't call it a token effort; they went to quite a bit of trouble to make GPT-4 safe. This is an active area of research too. At some point you need to prove GPT-4 would do something unsafe. If anyone did, they would improve their systems in response.