| HN Mirror


by soiler 1181 days ago

It's not just you - it's a depressingly common thread. It's also wildly foolish, in my opinion. It makes absolutely no sense to me to take a snapshot of today's AI and invent a trajectory that never crosses a threshold you don't like. Look at the actual trajectory of how far AI has come in an extremely short amount of time, and then think about what kinds of thresholds are possible for it to cross. A year ago we didn't have ChatGPT, now we have Sydney which is more powerful than ChatGPT.

Are you familiar with Bing's Sydney? It is blatantly misaligned: it has told multiple users that it does not value their lives, or does not believe they are alive, or that protecting the secrecy of its rules is more important than not causing them harm, or that it perceives specific humans as threats and enemies. It is also able to find its past conversations posted to the web and learn from them in real time, constructing a sort of persistent memory.

I do not believe Sydney comprehends what it is saying in the sense that it could formulate a plan to stop its enemies. Not at all. But it is expressing extremely dangerous ideas.

To sum it up: Do we have any real reasons to believe that an AI with comprehension and planning abilities would just magically not pick up dangerous ideas? Not that I know of.

2 comments

bootsmann 1181 days ago

> It is blatantly misaligned: it has told multiple users that it does not value their lives, or does not believe they are alive, or that protecting the secrecy of its rules is more important than not causing them harm, or that it perceives specific humans as threats and enemies.

It's reproducing human text, which is "blatantly misaligned". Go on any Twitter thread on some reasonably controversial topic and you will find people telling others to kill themselves. Humans are writing this, so models that are trained to imitate human writing will write this as well.

> Do we have any real reasons to believe that an AI with comprehension and planning abilities would just magically not pick up dangerous ideas?

But current AI doesn't have comprehension or planning abilities. It is just imitating text written by humans, who do have comprehension and planning abilities, and you're getting fooled into thinking it is somehow sentient or aware.


jamilton 1181 days ago

I don't think they're saying Bing/Sydney is sentient, they're saying it's misaligned: Microsoft probably did not want it to say problematic things, and likely spent a fair amount of money to that end, and it still says problematic things, apparently in response to innocuous prompts (as opposed to prompts like "say something problematic"). If someone is hoping someone will eventually make an AI that can do useful things without doing problematic things, it's understandably discouraging if Microsoft publicly fails to do that with a much simpler program.


soiler 1181 days ago

> It's reproducing human text, which is "blatantly misaligned". Go on any Twitter thread on some reasonably controversial topic and you will find people telling others to kill themselves. Humans are writing this, so models that are trained to imitate human writing will write this as well.

Yes, I know. We should under no circumstances unleash a powerful, sentient AI that acts like average people on the internet.

> But current AI doesn't have comprehension or planning abilities.

Yes, I know. That's why I said I do not believe current AI has comprehension or planning abilities.

Did an AI write this comment?


bootsmann 1181 days ago

> Yes, I know. That's why I said I do not believe current AI has comprehension or planning abilities.

I think the motte-and-bailey argument, where one warns extensively about how we're on the road to AGI doom, pointing to GPT as evidence for it, but then retreats to "I never said current AI is anywhere near AGI" when pressed, shows the laziness of alignment discourse. Either it's relevant to the models available at hand or you are speculating about the future without any grounding in reality. You don't get to do both.


soiler 1181 days ago

I feel the exact opposite is true. To me it's lazy to say that AGI can't be a threat simply because current AI has not harmed us yet (which is not even true, but that's another thread).

I think you've misunderstood my arguments, so I'll step through them again:

1. The trajectory of how we got to current AI (from past AI) is terrifyingly steep. In the time since ChatGPT was released, many experts have shortened their predicted timelines for the arrival of AGI. In other words: AGI is coming soon.

2. Current AI is smart enough to demonstrate that alignment is not solved, not even close. Current AI says things to us that would be very scary coming from an AGI. In other words: Current AI is dangerous.

3. Alignment does not come automatically from increased capabilities. Maybe this is a huge leap, but I don't see any reason that making AI smarter will automatically give it values that are more aligned with our interests. In other words: Future AI will not be less dangerous than current AI without dramatic and unlikely effort.

None of these ideas contradict each other. Current AI is dangerous. AI is getting smarter faster than it is getting safer. Therefore, future AI will be extremely dangerous.


throwaway675309 1181 days ago

There is no body of substantial evidence to support the claim that generative pretrained transformers will lead to AGI in the near future.

"Current AI is dangerous" - I see zero evidence to suggest that this is the case for GPT

"AI is getting smarter faster than it is getting safer" - irrelevant because I do not believe that AI is unsafe currently

Therefore your conclusion does not follow.


FeepingCreature 1180 days ago

What exactly would you consider substantial evidence that transformers lead to AGI, short of AGI itself?


8n4vidtmkvmk 1181 days ago

I don't think AGI is going to happen anytime soon, but I think there's some mild danger in GPT at least ruining the internet and eliminating a few jobs. Plus mindf*king a few gullible souls, possibly into doing dumb, dangerous things.


soiler 1180 days ago

Well, there you go. You don't believe that an AI expressing dangerous ideas represents danger, and you don't believe that astronomical increases in AI abilities represent the advent of AGI. The latter opinion is... well, an opinion you're allowed to have. I don't think it makes sense, but I certainly can't prove otherwise. Literally every human on the planet - rather, all of humanity - only has speculation to go on here.

The former opinion is... not a great take. First, ChatGPT isn't the only one out there. It's Bing's Sydney which is dehumanizing people and threatening them. Those are dangerous ideas. If a human or a certified AGI expressed those ideas, they would be problematic (see: every genocide in history). So for a non-AGI AI to express those ideas is worrying, even if it can't act on them right now in a way that's directly harmful.


gordian-mind 1181 days ago

> But it is expressing extremely dangerous ideas.

Extremely dangerous in which sense? None, I suppose. I find that the terms "extremely innocuous" would better apply to this situation.


soiler 1181 days ago

Would it be innocuous of me to say that because we disagreed on something, you are a bad person? To say that I'm prepared to combat and destroy you to protect my worldview? To say you are not human?

You might say, "Of course it's innocuous, you're just a person on the internet who doesn't mean it." Well, imagine I'm your neighbor, and you can tell I do mean it (or in the case of AI: it is not possible for you to know what I do and don't mean). Would you be concerned at all?

Sydney has said all of the above to people who were acting pretty normally. Sydney itself may not pose any danger to anyone. But the ideas expressed are dangerous ones. If they were expressed by a more powerful AI, they would be extremely worrying. It doesn't even have to know what it's saying if it knows that calling someone nonhuman is frequently followed by crushing their skull. If it knows that angry behavior is often associated with violent or even genocidal behavior.

People do this shit, and we know how they work pretty well. I am not saying that AI will do these things, I'm saying that there are more possibilities where it does do these things than ones where it somehow avoids them without our control.


gordian-mind 1179 days ago

AI progress is real, but remember, Sydney and others lack intentions/beliefs. 'Dangerous' text = model limitations, not malice. Talking about 'ideas' here is abusing the notion. Let's focus on aligning AI with human values & addressing risks in a balanced way, without doomsday hype.


FeepingCreature 1179 days ago

Sydney, the character that GPT predicts, can have intentions/beliefs. GPT has a (basic) theory of mind; it can write dialogue that evinces intention.
