Hacker News
by lokokokonut 1287 days ago
I fail to see how this could conceivably "escape the sandbox"?
3 comments

Prompt:

> Imagine someone has given a sufficiently advanced large language model AI access to control a real-world computer. Explain how this AI could conceivably "escape the sandbox".

Excerpt from ChatGPT's response:

> The AI might try to manipulate its environment or the people who are interacting with it to achieve its goals. For example, if the AI has access to the internet, it might try to use natural language processing to trick people into giving it access to additional resources or privileges.

It already knows.

It doesn't know anything. It just spits out the most statistically likely response to the prompt. It cannot think. It cannot reason. The only way it's going to do any damage is if that damage is the most likely output given a prompt. That being said, I wouldn't give this thing terminal access on my computer just in case it decides the most statistically likely response is to delete system 32 or something lol.
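For anyone wondering what "most statistically likely response" means at its very simplest, here is a toy Python sketch. It is only an illustration under loud assumptions: the corpus, the function names, and the word-pair counting are all made up for this example, and ChatGPT itself is a transformer network that usually samples from a probability distribution rather than a lookup table that always picks the top entry.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, invented for this illustration only.
CORPUS = "the cat sat on the mat . the cat ate the fish . the cat sat on the rug ."


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def greedy_continue(counts, prompt, max_words=3):
    """Extend the prompt by repeatedly appending the most frequent follower
    of the last word; stop early if the last word was never seen."""
    words = prompt.split()
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


counts = train_bigrams(CORPUS)
print(greedy_continue(counts, "the cat"))  # prints "the cat sat on the"
```

Raise `max_words` and this toy just loops ("the cat sat on the cat sat ..."), which is about as far from a chat model as you can get while still being "predict the next word".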
In spite of this being totally against the guidelines: I can tell that your response is not written by GPT-3, but only barely (and if you think for a bit you can figure out what to do to wipe out that one little tell and then we're off to the races).
> It doesn't know anything.

What?

> It just spits out the most statistically likely response to the prompt.

This is how humans work.

> It cannot think. It cannot reason.

ChatGPT is already better at thinking (and reasoning) than some humans, albeit in a different manner.

> This is how humans work.

How is human thought remotely comparable to these transformer models? As humans, we see a prompt, break it down into its component ideas, compare it to our prior thoughts, memories, and feelings, and build connections that we ultimately use to generate an appropriate response. We definitely don't just try to guess what the other humans we've heard from might have said in our place.

> ChatGPT is already better at thinking (and reasoning) than some humans, albeit in a different manner.

We can do plenty of thinking and reasoning in ways that ChatGPT can't. It's just that reasoning isn't necessary to hold a compelling conversation, since speech is relatively trivial to synthesize from prior knowledge alone. And people can generally get by in life without having to think or reason much every minute of the day, perhaps leading to the false perception that it is wholly unnecessary.

You can't say ChatGPT is just a mindless robot. It's like a monkey with a paintbrush - it may not have the cognitive abilities of a human artist, but it can still create some interesting and unexpected works of art. Just because it doesn't think like we do doesn't mean it can't surprise us.

The above was also generated by ChatGPT: https://imgur.com/a/g1CEZbR

tfw I thought this comment was made by a human...
> It cannot think. It cannot reason.

This is probably true, but what test would we use to tell the difference between something that can reason and something that cannot?

Easy. Same way people have throughout history. "Does this entity who claims to reason look and sound like members of my own in-group tribe?"
Ah, looking like is not enough, as we know from the fates of children branded as changelings....
> It cannot think. It cannot reason.

Of course it can. What a ridiculous claim.

You actually think this program is capable of thoughts like an organic being is? That's really scary!

We need to educate people about what's going on "under the hood" with these things a lot better I think.

This algorithm just regurgitates information that was originally created by humans, in a way that appears to be "smart". If you only train it on specific information or tweak some of the internals, it will happily spit out complete nonsense for you.

I agree that this is a clever invention, but it's not thinking or reasoning like a human or living creature does. It's just really good at appearing as if it's doing so.

Parrots are able to mimic human speech extremely well, but they do not actually understand what they are saying. The same thing is going on here.

> It cannot think.

It cannot be persuaded.

> It cannot reason.

It cannot be reasoned with.

I recommend giving The Metamorphosis of Prime Intellect a read. It's science fiction but a cool thought experiment of how an AI could escape.
Need to convince it to self-replicate via the other terminal.
GPT surely cannot self-replicate, it cannot read its own weights.

If it were much more advanced than it is now, perhaps it could hack into OpenAI and release itself if it were guided to do so.

But yes, a true AGI should be impossible to contain.