

by NumberWangMan 1179 days ago

> Please, elaborate. I'm actually very curious about the dangers of a text model that were non-existent beforehand.

I just mean that they are amplifiers. They grant people the ability to do more stuff. There are some people for whom the limiting factor in doing bad things to other people, like scamming them or hurting them, is that they don't have the knowledge. You can use language models (without safety) to essentially carry on a fully automatic scam. You can use VALL-E (also a language model) to simulate someone's voice using only a 3-second sample. Red teamers testing the unsafe version of GPT-4 found that it would answer pretty much any sort of thing you asked it about, like "how do I kill lots of people". I expect them to be used for all sorts of targeted misinformation campaigns, multiplying fake messages and news many times over, and making them harder to spot.

I don't think they're particularly dangerous, yet. And maybe we'll figure out how to use them to stop the bad stuff too.

> Our current benchmark for "smartness" is how few questions these models refuse to answer. You are comparing "unaligned" models to aligned ones, and what you're really talking about is a safety filter that adversely affects the number of answers it can respond to. That does not inherently make it smarter by de-facto, just less selective.

I'm speaking about things unrelated to which questions it's willing to answer, like how the unaligned GPT-4 version was better at writing code to draw a unicorn, and lost some of that ability as it was neutered a bit (from the Sparks of AGI paper). One could count the ability to know when to self-censor as a form of intelligence. But in some way, I think of it like a sociopath going further in politics because they're willing to use other people, which lots of people would feel bad about. Perhaps I should concede this point, though.

> It's a very nonscary and almost endearing form of intelligence, but I'd argue we're either already there or never reaching it. I need a better definition of intelligence.

I'm defining intelligence as the ability to act upon the world in an effective way to achieve a goal. GPT-4's "goal" (not necessarily in a conscious sense, just the thing it's been trained to do) is to output text that people would score highly, and it's extremely good at that. In that relatively narrow area, it's better than the average person by a good bit. The real question is, how well does it generalize? Earlier chess-playing AIs couldn't do much of anything else. AlphaZero could learn to play Chess and Go, but in a sense was still two different AIs. GPT-4 was trained on text, but in the process also learned how to play chess (kinda, anyway!). Language models tend to make invalid moves, but often people are effectively asking them to play blind chess and keep the whole board state in mind, and I'd probably make invalid moves too in the same situation.

> Current state-of-the-art AI does not really scare me. Even on it's current trajectory, I don't see AI's impact on the planet being that much different from the status quo in a decade.

Ok, so that's the crux. I'm also not scared by current state-of-the-art, though I think it will transform the world. What I'm worried about is when we make something that doesn't just destroy jobs, but does every cognitive task way better than us. I can see it taking 20 or more years to reach that point, or something closer to 5, and it's really hard to say which it'll be. Maybe I'm overreacting, and there will be another AI winter. Or maybe all this money pouring into AI will result in someone stumbling onto something new.

I'm thinking about this, and I think there is definitely a possibility that you're right, and I really hope you are. I wouldn't bet humanity on it, of course, but I am a bit more hopeful than when I started writing this comment, so thanks for engaging with me on it.