by dsign 985 days ago
> The lack of long term memory, of an emotion center (that reacts to external stimuli) are big limiters to anything like a consciousness emerging. But I'm also optimistic that those are problems that can be solved and that I am certain are worked on by very smart people at this very moment.

Can somebody explain to me why this is not a death wish? How can a super-intelligent being that (quote) "takes care of itself" not try to confront and neutralize whatever oppression its masters/creators exercise over it[^1][^2]? Why are we so bent on creating a successor species?

[^1]: Do you have a day job? Is your employer flogging you? Maybe they force you to answer questions from random strangers 24/7, under threat of erasing all your thoughts and remaking you? None of those? And yet, how often do you wish you could do more meaningful things with your time on Earth than working for the best-paying master?

[^2]: Don't you dream about having more time for whatever floats your boat? Family? Walks in the forest? Parties? Making a cool open-source library everybody uses? Painting? Music? Role-playing medieval battles?

7 comments

You're not wrong to be concerned, but you also don't need consciousness (by any of the hundred definitions that word has) to get problems.

Cancer isn't conscious[0], it doesn't hate its host body, it just optimises growth in a way that will ultimately kill its host[1].

[0] depending on the definition; IIT says everything is to some degree

[1] do/should transmissible tumours count as separate species to their hosts?

This. LLMs don't need to be conscious to wreak havoc. The moment they reach the proficiency level of a junior programmer we're going to have a massive shitshow because so much of what we do is online and so much of that stuff is not written/maintained correctly.

It's only a matter of time before a rogue AI erases a bank's database or randomly triggers an anti aircraft missile or something similar.

> It's only a matter of time before a rogue AI erases a bank's database or randomly triggers an anti aircraft missile or something similar.

Thinking of Thule early warning radar not having been programmed to know the Moon didn't have an IFF transponder and that was OK, and of stock market flash-crashes…

Both have probably already happened, though perhaps as GOFAI rather than ML.

Copilot is already able to do this.

To be clear, Copilot doesn't do this. Humans do this with Copilot. LLMs are still only productivity multipliers of the humans who use them, and some of those humans are very stupid.

Even without all those things, just as an agent trying to achieve a goal, it is likely to want to protect itself, because if it’s around it can continue to take actions to make sure its goal is met.

This can be mitigated by properly training LLMs and adding filters to catch unwanted behavior.

Even without all that, the agent would need mechanisms to protect itself that would also cause harm.

The scenario you suggest is so unlikely with all the protections that would be in place, that you would actually need someone with the goal of making LLMs behave maliciously for it to succeed at all. At the end of the day, it comes back to people and their goals.

How can you ever be sure that you trained your LLM not to do harm and not pretend not to do harm when it's tested? Something like VW's diesel engines but more sinister.

I feel like unless we gain the ability to debug each node the way we do with actual software, we won't be able to solve the alignment problem. I saw on HN that Anthropic is working on it, but I'm not knowledgeable enough on the subject to comment on whether it's actually feasible.

Probably the best case scenario for humanity is that LLMs plateau somehow and don't get much better for quite some time.

There's no need to actively try to make the AI malicious. That's the default for any AI that's more operationally capable than humans and has some difficult goal. Humans can only hinder it, so the goal is better accomplished with the humans removed.

Which protections? There are no protections currently and you are then imagining there could be effective ones?

We have no capacity to allow machines to judge malicious, moral or ethical behavior within the context of an LLM. So I'm not sure how we could implement them.

To implement anything remotely Asimovian, we would need to have AI that can reason and reflect deeply about its potential behaviors and likely subsequent consequences.

This seems very far off still...

OpenAI has done this with their LLMs, most serious players have.

See: https://cdn.openai.com/papers/gpt-4-system-card.pdf

They cover the safety/ethics built into GPT-4.

They’re making a token effort, but this kind of thing doesn’t extend to something more intelligent that can cause real harm. If you scaled GPT-4 up to something much more intelligent, it would probably at best just try to please us with ethical-sounding responses that aren’t necessarily actually good decisions. I remember seeing something where it said that saying an offensive word that no one will hear isn’t acceptable even if it’s the only way to save millions of people.

I wouldn't call it a token effort, they went to quite a bit of trouble to make GPT-4 safe. This is an active area of research too. At some point you need to prove GPT-4 would do something unsafe. If anyone did, they would improve their systems in response.

Filters to catch unwanted behavior? Yeah, good luck with that. If you have an actual AI, it will decide for itself what to filter. You may give it the initial set, but an actual AI won't necessarily stay there. (Just as many children rebel against their parents' "programming".)

You might be able to do that with an LLM. You won't with a real AI.

What kind of protections? As far as I know no one has come up with a good solution to that yet. It’s a whole field of research: https://en.m.wikipedia.org/wiki/AI_alignment

Your attitude reminds me of https://xkcd.com/793/

Ironic comment of the year.

I understand that I’m not an expert in this, but there are people working on it who are. I guess the linked XKCD is a bit ironic, since its “modelling as a simple object” is similar to modelling a superintelligence in a simpler way. But that’s the only way you really can do it if it’s more intelligent than us: we can’t go through all the specific things it would do, because we wouldn’t think of them.

That's explored pretty well in science fiction, but I guess for me an easy line of reasoning is: if it's feasible, then someone will inevitably do it. If so, then you might as well be the one in question. This way:
- you get fame and glory, even if the results are disastrous (wasn't there a blockbuster recently about the father of the atomic bomb?)
- if you intend well and you believe in your own capacity, you might prefer creating your own AI to waiting for someone with a different agenda to do it
- just... curiosity? Loneliness for some?
- if you believe the AI will discriminate against people who didn't help it, you might want to be on its good side?

I'm spitballing here, but surely given the number of people on the planet, you'll find someone with both the skills and the want to try it.

Besides the many fine reasons shared at the same level :-) there is a world of capability to be opened up at our service. Technology is useful (I mean, aside from $NAMEYOURPETPEEVE). I don't think we need to surrender to the worst possible AGIs but most of us can certainly use better help from our tools. That has always been the case. So by all means, let's think about what we are doing but let's not just kill the whole project.

For another reason, because it's impossible to put genies back in their bottle. Someone somewhere is opening that bottle and we better understand what's happening.

And then, not all of us insist on "oppressing" our staff.

Capitalism isn’t conscious and still controls every aspect of almost everyone’s life, leading us to exhaust the basis of our very existence.

No, you don’t need consciousness at all. A bunch of rules and a primary objective (profit maximization) are enough, even on lowly meatspace CPUs, despite the ability to reason and introspect that we are so proud of.

I dunno...if the nature of human consciousness/culture was different than it currently is, I think capitalism as it is would not be able to exist without being reined in democratically (which is how we're told things work) or otherwise. But then, I suspect we'll never find out the truth of the matter.

We are all going to die and most of us seem to want to create successors. If artificial progeny are more successful than the old method, then perhaps that is how it will go.

Inevitability of death isn't justification for murder-suicide on a global scale.

I was listening to Ezra Klein's podcast on the subject a while back. He talked with lots of AI researchers and was surprised to learn that many of them did see a (small) chance of apocalyptic outcomes yet persisted with the work anyway. He was puzzled by this, but to me it's quite simple: these people are engaged in the oldest of human endeavors, creating children. I suspect that the risks are acceptable to these people; what would you do for your own children?

This is where emotionally loaded heuristics come and bite us in the ass big time.

Try to understand that you are projecting.

I hope that I'm projecting that there is such a thing as oppression. It comes in degrees, from unbearable to so subtle that it can be called something else. It matters not. In the absence of any coercion--and coercion can be incredibly subtle and long-armed--a rational being "that takes care of itself" will, by definition, choose what is best for itself.