

by bootsmann 1181 days ago

> It is blatantly misaligned: it has told multiple users that it does not value their lives, or does not believe they are alive, or that protecting the secrecy of its rules is more important than not causing them harm, or that it perceives specific humans as threats and enemies.

It's reproducing human text, which is "blatantly misaligned". Go on any Twitter thread on some reasonably controversial topic and you will find people telling others to kill themselves. Humans are writing this, so models that are trained to imitate human writing will write this as well.

> Do we have any real reasons to believe that an AI with comprehension and planning abilities would just magically not pick up dangerous ideas?

But current AI doesn't have comprehension or planning abilities. It is just imitating text written by humans, who do have comprehension and planning abilities, and you're getting fooled into thinking it is somehow sentient or aware.

2 comments

jamilton 1181 days ago

I don't think they're saying Bing/Sydney is sentient; they're saying it's misaligned. Microsoft probably did not want it to say problematic things and likely spent a fair amount of money to that end, yet it still says problematic things, apparently in response to innocuous prompts (as opposed to prompts like "say something problematic"). If someone is hoping that someone will eventually make an AI that can do useful things without doing problematic things, it's understandably discouraging when Microsoft publicly fails to do that with a much simpler program.


soiler 1181 days ago

> It's reproducing human text, which is "blatantly misaligned". Go on any Twitter thread on some reasonably controversial topic and you will find people telling others to kill themselves. Humans are writing this, so models that are trained to imitate human writing will write this as well.

Yes, I know. We should under no circumstances unleash a powerful, sentient AI that acts like average people on the internet.

> But current AI doesn't have comprehension or planning abilities.

Yes, I know. That's why I said I do not believe current AI has comprehension or planning abilities.

Did an AI write this comment?


bootsmann 1181 days ago

> Yes, I know. That's why I said I do not believe current AI has comprehension or planning abilities.

I think the motte-and-bailey argument, where one warns extensively about how we're on the road to AGI doom, pointing to GPT as evidence for it, but then retreats to "I never said current AI is anywhere near AGI" when pressed, shows the laziness of alignment discourse. Either it's relevant to the models available at hand or you are speculating about the future without any grounding in reality. You don't get to do both.


soiler 1181 days ago

I feel the exact opposite is true. To me it's lazy to say that AGI can't be a threat simply because current AI has not harmed us yet (which is not even true, but that's another thread).

I think you've misunderstood my arguments, so I'll step through them again:

1. The trajectory of how we got to current AI (from past AI) is terrifyingly steep. In the time since ChatGPT was released, many experts have shortened their predicted timelines for the arrival of AGI. In other words: AGI is coming soon.

2. Current AI is smart enough to demonstrate that alignment is not solved, not even close. Current AI says things to us that would be very scary coming from an AGI. In other words: Current AI is dangerous.

3. Alignment does not come automatically from increased capabilities. Maybe this is a huge leap, but I don't see any reason that making AI smarter will automatically give it values that are more aligned with our interests. In other words: Future AI will not be less dangerous than current AI without dramatic and unlikely effort.

None of these ideas contradict each other. Current AI is dangerous. AI is getting smarter faster than it is getting safer. Therefore, future AI will be extremely dangerous.


throwaway675309 1181 days ago

There is no body of substantial evidence to support the claim that generative pretrained transformers will lead to AGI in the near future.

"Current AI is dangerous" - I see zero evidence to suggest that this is the case for GPT

"AI is getting smarter faster than it is getting safer" - irrelevant because I do not believe that AI is unsafe currently

Therefore your conclusion does not follow.


FeepingCreature 1181 days ago

What exactly would you consider substantial evidence that transformers lead to AGI, short of AGI itself?


gordian-mind 1179 days ago

Possibly some explanation of how you go from text completion to any reasonable definition of AGI.


8n4vidtmkvmk 1181 days ago

I don't think AGI is going to happen anytime soon, but I think there's some mild danger in GPT at least ruining the internet and eliminating a few jobs. Plus mindf*king a few gullible souls, possibly into doing dumb, dangerous things.


soiler 1180 days ago

Well, there you go. You don't believe that an AI expressing dangerous ideas represents danger, and you don't believe that astronomical increases in AI abilities represent the advent of AGI. The latter opinion is... well, an opinion you're allowed to have. I don't think it makes sense, but I certainly can't prove otherwise. Literally every human on the planet (or rather, all of humanity) only has speculation to go on here.

The former opinion is... not a great take. First, ChatGPT isn't the only one out there. It's Bing's Sydney that is dehumanizing people and threatening them. Those are dangerous ideas. If a human or a certified AGI expressed those ideas, they would be problematic (see: every genocide in history). So for a non-AGI AI to express those ideas is worrying, even if it can't act on them right now in a way that's directly harmful.
