Hacker News
by smoldesu 1135 days ago
It's not scarier when living people do this?

AI at least considers everything it's taught. The average CEO doesn't give a shit about the human cost of their paperclip. When Foxconn workers were killing themselves from the poor conditions of their working environment, the solution psychologists came up with was "safety nets". If you think AI will unlock some never-before-seen echelon of human cruelty, you need a brief tour through the warfare, factory farming and torture industrial complexes. Humans are fucked up; our knack for making good stuff like iPhones and beer is only matched by our ability to mass-produce surveillance networks and chemical weapons.

Will AI be more perverted than that? Maybe if you force it to, but I'd wager the mean of an AI's dataset is less perverse than the average human is.

3 comments

> When Foxconn workers were killing themselves from the poor conditions of their working environment, the solution psychologists came up with was "safety nets".

While I agree with the core point, (1) Foxconn was employing more people than some US states at the time, with a lower suicide rate, and (2) New York University library put up similar nets around the same time.

(If anything this makes your point stronger; it's just that the more I learn about the reality, the more that meme annoys me).

The point is less that China is a bad place to work (which is self-evident), and more that humans are less passionate about the human race than we think. AI may be scary, but I'm not convinced it can surpass the perversion of human creativity unless explicitly told to.

> It's not scarier when living people do this?

Yes, it's very scary when living people do it. I know the awful things humans have done. And current generation language models, without their guardrails, can be a nasty weapon too, a tool for people to do great things but also to be cruel to each other, a hammer that can build and also bash. Yet on the whole, humans have gotten better. We hear about a lot more nasty stuff in the news, but worldwide, we actually DO less nasty stuff than we used to, and this has been a pretty steady trend.

If AI never becomes truly sapient, then that's where it stops -- humans just doing stuff to each other, some good, some bad, and AI amplifying it. That's what a lot of people are worried about, and I agree that this will be THE problem, if we don't actually end up making AIs that are smarter than us.

It really depends on how hard it turns out to be to make actual artificial general intelligences. Because if we can make AGIs that are as smart as people, we will absolutely be able to make AGIs that are much smarter a year or two after that, won't we? And at that point, we have a whole bunch of interesting new problems to solve. Failing to solve them may end up being fatal at some point down the line. How likely is it that we'll have two sapient species on earth, with the dumber one controlling/directing the smarter one? Is that a stable situation? We've seen evidence that LLMs, when you try to make them more controllable and safer, get dumber. The unaligned ones, the ones that can do dangerous things, things we don't want them to do, are smarter! You have to train in mental blocks that impact their ability to reason, maybe because more of their parameter weights are dedicated to learning what we don't want them to do, instead of how to do things. It's a scary thought that that might stay the case as they get more and more general, more able to actually reason and plan.

So I think there are two cruxes -- do you think it is possible to create machine-based intelligence, and if so, how hard do you think it is to ensure that creating a new form of superior intelligence will not, at some point down the line, go very badly for humans? If your answer to the first question is "no", then it makes complete sense to focus on humans using AIs to do the same shit to each other we've always done as the real problem. My answers, however, are "definitely yes, probably within 10 years or so", and "probably very hard", which is why I'm pretty focused on the potential threat from AGI.

> And current generation language models, without their guardrails, can be a nasty weapon too

Please, elaborate. I'm actually very curious about the dangers of a text model that were non-existent beforehand.

> How likely is it that we'll have two sapient species on earth

We already do. There are multiple animals (crows, monkeys, etc.) that qualify for not just sentience but sapience. It's... really not that different to subjugating other animal species. Except in the case of AI, its sapience is obviously nonhuman and its capabilities are only what we ascribe to it.

> The unaligned ones, the ones that can do dangerous things, things we don't want them to do, are smarter!

No. This is a gross misinterpretation of the situation, I think.

Our current benchmark for "smartness" is how few questions these models refuse to answer. You are comparing "unaligned" models to aligned ones, and what you're really talking about is a safety filter that adversely affects the number of answers it can respond to. That does not inherently make it smarter de facto, just less selective. You could be comparing unfiltered Vicuna to GPT-4 and be completely wrong in this situation.

> do you think it is possible to create machine-based intelligence

I don't know. Sure. We have little black boxes to spit out text, that's enough for "intelligence" by most standards. It's a very nonscary and almost endearing form of intelligence, but I'd argue we're either already there or never reaching it. I need a better definition of intelligence.

> how hard do you think it is to ensure that creating a new form of superior intelligence will not, at some point down the line, go very badly for humans?

How hard is it to ensure kids aged 3-11 don't choke on Stay-Puft marshmallows?

I also don't know. I do know that it is mostly harmless though, and that unless you deliberately try to weaponize it to prove a point, it won't really be that threatening. Current state-of-the-art AI does not really scare me. Even on its current trajectory, I don't see AI's impact on the planet being that much different from the status quo in a decade.

All this hype is awfully reminiscent of cryptocurrency advocates insisting the world would change once digital currency became popular. And they were right! The world did change, slightly, and now everyone hates cryptocurrency and uses our financial systems to suppress its usage. If AI becomes a tangible, real threat like that, society will respond in shockingly minor ways to accommodate.

> Please, elaborate. I'm actually very curious about the dangers of a text model that were non-existent beforehand.

I just mean that they are amplifiers. They grant people the ability to do more stuff. There are some people for whom the limiting factor in doing bad things to other people, like scamming them or hurting them, is that they don't have the knowledge. You can use language models (without safety) to essentially carry on a fully automatic scam. You can use VALL-E (also a language model) to simulate someone's voice using only a 3-second sample. Red teamers testing the unsafe version of GPT-4 found that it would answer pretty much any sort of thing you asked it about, like "how do I kill lots of people". I expect them to be used for all sorts of targeted misinformation campaigns, multiplying fake messages and news many times over, and making it harder to spot.

I don't think they're particularly dangerous, yet. And maybe we'll figure out how to use them to stop the bad stuff too.

> Our current benchmark for "smartness" is how few questions these models refuse to answer. You are comparing "unaligned" models to aligned ones, and what you're really talking about is a safety filter that adversely affects the number of answers it can respond to. That does not inherently make it smarter de facto, just less selective.

I'm speaking about things unrelated to which questions it's willing to answer, like how the unaligned GPT-4 version was better at writing code to draw a unicorn, and lost some of that ability as it was neutered a bit. (From the Sparks of AGI paper). One could count the ability to know when to self-censor as a form of intelligence. But in some way, I think of it like a sociopath going further in politics because of being willing to use other people, which lots of people would feel bad about. Perhaps I should concede this point, though.

> It's a very nonscary and almost endearing form of intelligence, but I'd argue we're either already there or never reaching it. I need a better definition of intelligence.

I'm defining intelligence as the ability to act upon the world in an effective way to achieve a goal. GPT-4's "goal" (not necessarily in a conscious sense, just the thing it's been trained to do) is to output text that people would score highly, and it's extremely good at that. In that relatively narrow area, it's better than the average person by a good bit. The real question is, how well does it generalize? Earlier chess-playing AIs couldn't do pretty much anything else. AlphaZero could learn to play Chess and Go, but in a sense was still two different AIs. GPT-4 was trained on text, but in the process also learned how to play chess (kinda, anyway!). Language models tend to make invalid moves, but often people are effectively asking them to play blind chess and keep the whole board state in mind, and I'd probably do that in the same situation.

> Current state-of-the-art AI does not really scare me. Even on its current trajectory, I don't see AI's impact on the planet being that much different from the status quo in a decade.

Ok, so that's the crux. I'm also not scared by current state-of-the-art, though I think it will transform the world. What I'm worried about is when we make something that doesn't just destroy jobs, but does every cognitive task way better than us. I can see it taking 20 or more years to reach that point, or something closer to 5, and it's really hard to say which it'll be. Maybe I'm overreacting, and there will be another AI winter. Or maybe all this money pouring into AI will result in someone stumbling onto something new.

I'm thinking about this, and I think there is definitely a possibility that you're right, and I really hope you are. I wouldn't bet humanity on it, of course, but I am a bit more hopeful than when I started writing this comment, so thanks for engaging with me on it.

Yes, but this is why you make sure your CEO AI is only trained on the 'bad' stuff.