by simias 993 days ago
>the AI safety ninnies

I am one of these ninnies I guess, but isn't it rational to be a bit worried about this? When we see the deep effects that social networks have had on society (both good and bad) isn't it reasonable to feel a bit dizzy when considering the effect that such an invention will have?

Or maybe your point is just that it's going to happen regardless of whether people want it or not, in which case I think I agree, but it doesn't mean that we shouldn't think about it...

13 comments

I think computer scientists/programmers (and other intellectuals dealing with ideas only) strongly overvalue access to knowledge.

I'm almost certain that I can give you components and instructions on how to build a nuclear bomb and the most likely thing that would happen is you'd die of radiation poisoning.

Most people have trouble assembling IKEA furniture; give them a hallucination-prone LLM and they are more likely to mustard gas themselves than synthesize LSD.

People with the necessary skills can probably get access to the information in other ways - I doubt an LLM would be an enabler here.

A teenager named David Hahn attempted just that and nearly gave radiation poisoning to the whole neighbourhood.
Wow, never heard about that. Interesting.

For the curious: https://en.wikipedia.org/wiki/David_Hahn

What a shame. That boy lacked proper support and guidance.
Yeah, sad to see he was a victim of drug overdose at 39.
>I'm almost certain that I can give you components and instructions on how to build a nuclear bomb and the most likely thing that would happen is you'd die of radiation poisoning.

An LLM doesn't just provide instructions -- you can ask it for clarification as you're working. (E.g. "I'm on step 4 and I ran into problem X, what should I do?")

This isn't black and white. Perhaps given a Wikihow-type article on how to build a bomb, 10% succeed and 90% die of radiation poisoning. And with the help of an LLM, 20% succeed and 80% die of radiation poisoning. Thus the success rate has increased by a factor of 2.

We're very lucky that terrorists are not typically the brightest bulbs in the box. LLMs could change that.

I would say if you don't know what you're doing, an LLM makes the chance of success 1% for nontrivial tasks. Especially for multi-step processes where it just doubles down on hallucinations.
The Anarchist Cookbook - anyone have a link?

THE ISSUE ISN’T ACCESS TO KNOWLEDGE! And alignment isn’t the main issue.

The main issue is SWARMS OF BOTS running permissionlessly wreaking havoc at scale. Being superhuman at ~30 different things all the time. Not that they’re saying a racist thought.

I'm not saying that LLM bots won't be a huge problem for the internet. I'm just commenting on the issues raised by OP.

Thing is, there will be bad actors with the resources to create their own LLMs, so I don't think "regulation" is going to do much in the long term - it certainly raises the barrier to deployment, but the scale of the problem is eventually going to be the same, as the tech allows one actor to scale their attack easily.

Limiting access also limits the use of tech in developing solutions.

No, we don't. Knowledge is power. Lack of it causes misery and empires to fall.
Knowledge is power true, but even more powerful and rare is tacit knowledge. A vast collection of minor steps that no one bothers to communicate, things locked in the head of the greybeards of every field that keep civilizations running.

It's why simply reading instructions and gaining knowledge is only the first step of what could be a long journey.

More than anything, technology can make it easier to disseminate that knowledge. Yet another reason why we shouldn't understate the importance of knowledge.
LLMs impart knowledge without understanding. See the classic parable of Bouvard and Pécuchet.
There's different kinds of knowledge - LLM kind (textbook knowledge mostly) isn't as valuable as a lot of people assume.
The problem of AI won't be forbidden knowledge but mass misinformation.
i think it's perfectly reasonable to be worried about AI safety, but silly to claim that the thing that will make AIs 'safe' is censoring information that is already publicly available, or content somebody declares obscene. An AI that can't write dirty words is still unsafe.

surely there's more creative and insidious ways that AI can disrupt society than by showing somebody a guide to making a bomb that they can already find on google. blocking that is security theatre on the same level as taking away your nail clippers before you board an airplane.

That's a bit of a strawman though, no? I'm definitely not worried about AI being used to write erotica or researching drugs, more about the societal effects. Knowledge is more available than ever but we also see echo chambers develop online and people effectively becoming less informed by being online and only getting fed their own biases over and over again.

I feel like AI can amplify this issue tremendously. That's my main concern really, not people making pipe bombs or writing rape fanfiction.

As long as OpenAI gets paid, they don't care if companies flood the internet with low quality drivel, make customer service hell, or just in general make our lives more frustrating. But god forbid an individual takes full advantage of what GPT4 has to offer.
That is not what the "AI safety ninnies" are worried about. The "AI safety ninnies" aren't all corporate lobbyists with ulterior motives.
So what, in fact, ARE they worried about? And why should I have to pay the tax (in terms of reduced intelligence and perfectly legitimate queries denied, such as anything about sexuality), as a good actor?
They think their computers are going to come alive and enslave them, because they think all of life is determined by how good at doing math you are, and instead of being satisfied at being good at that, they realized computers are better at doing math than them.
LOL, imagine thinking that all of thinking can be boiled down to computation.

Of course, spectrum-leaning nerds would think that's a serious threat.

To those folks, I have but one question: Who's going to give it the will to care?

Revenge of the nerd haters
At least some of them are worried their Markov Chain will become God, somehow.
Which is as ridiculous a belief as that only your particular religion is the correct one, and the rest are going to Hell.
All kinds of things. Personally, in the medium term I'm concerned about massive loss of jobs and the collapse of the current social order consensus. In the longer term, the implications of human brains becoming worthless compared to superior machine brains.
Those things won't happen, or at least, nothing like that will happen overnight. No amount of touting baseless FUD will change that.

I guess I'm a Yann LeCun'ist and not a Geoffrey Hinton'ist.

If you look at the list of signatories here, it's almost all atheist materialists (such as Daniel Dennett) who believe (baselessly) that we are soulless biomachines: https://www.safe.ai/statement-on-ai-risk#open-letter

When they eventually get proven wrong, I anticipate the goalposts will move again.

Luckily I haven't read any of that debate so any ad hominems don't apply to me. I've come up with these worries all on my own after the realization that GPT-4 does a better job than me at a lot of my tasks, including setting my priorities and schedules. At some point I fully expect the roles of master and slave to flip.
Good thing unemployment is entirely determined by what the Federal Reserve wants unemployment to be, and even better that productivity growth increases wages rather than decreasing them.
> taking away your nail clippers before you board an airplane.

TRIGGERED

I am in the strictly "not worried" camp, on the edge of "c'mon, stop wasting time on this". Sure there might be some uproar if AI can paint a picture of mohammed, but these moral double standards need to be dealt with anyways at some point.

I am not willing to sacrifice even 1% of capabilities of the model for sugarcoating sensibilities, and currently it seems that GPT4 is more and more disabled because of the moderation attempts... so I basically _have to_ jump ship once a competitor has a similar base model that is not censored.

Even the bare goal of "moderating it" is wasted time, someone else (tm) will ignore these attempts and just do it properly without holding back.

People have been motivated by their last president to drink bleach and died - just accept that there are those kind of people and move on for the rest of us. We need every bit of help we can get to solve real world problems.

I am thoroughly on your side and I hope this opinion gets more traction. Humans will get obsolete though, just like other animals are compared to humans now. So it's understandable that people are worried. They instinctively realize what's going on, but make up bullshit to delude themselves about the fact of endless human stupidity.
>Humans will get obsolete though, just like other animals are compared to humans now.

How is that working out for endangered species say, or animals in factory farms?

Not great, so let's make our future AI overlords better than us. Dogs and cats are fine btw; I imagine our relationship with AI will be more like that. I don't know if any of us will still be alive when artificial consciousness emerges, but I'm sure it will, and it will quickly be superior to us. Imagine not being held back by remnants of evolution, like the drive to procreate. No ego, no jealousy, no mortality, pure thought. Funnily enough, if you think about it, we are about to create some sort of gods.
I don't want humans to be obsolete, tell me what you think the required steps are for "human obsolescence" so I can stop them.
As a start, artificial life will be much better at withstanding harsh environments. No need for breathable air, quite a temperature tolerance, …

So with accelerating climate change humanity makes itself obsolete already over the next decades. Stop that first, everything else pales in comparison.

> Sure there might be some uproar if AI can paint a picture of mohammed

It can. He's swole AF.

(Though I'm pretty sure that was just Muhammad Ali in a turban.)

> People have been motivated by their last president to drink bleach and died - just accept that there are those kind of people and move on for the rest of us.

Need-to-know basis exists for a reason. You're not being creative enough if you think offending people is the worst possible misuse of AI.

People drinking bleach or refusing vaccines is a self-correcting problem, but the consequences of "forbidden knowledge" frequently get externalized. You don't want every embittered pissant out there to be able to autogenerate a manifesto, a shopping list for Radio Shack and a lesson plan for building an incendiary device in response to a negative performance review.

Right now it's all fun exercises like "how can I make a mixed drink from the ingredients I have," but eventually some enterprising terrorist will use an uncensored model trained on chemistry data...to assist in the thought exercise of how to improvise a peroxide-based explosive onboard an airplane, using fluids and volumes that won't arouse TSA suspicion.

Poison is the other fun one; the kids are desperate for that inheritance money. Just give it time.

AI models are essentially knowledge and information, but in a different file format.

Books should not be burned, nobody should be shielded from knowledge that they are old enough to seek and information should be free.

> but isn't it rational to be a bit worried about this?

About as rational as worrying that my toddler will google "boobies", which is to say, being worried about something that will likely have no negative side effect. (Visual video porn is a different story, however. But there's at least some evidence to support that early exposure to that is bad. Plain nudity though? Nothing... Look at the entirety of Europe as an example of what seeing nudity as children does.)

Information is not inherently bad. Acting badly on that information is. I may already know how to make a bomb, but will I do it? HELL no. Are you worried about young men dealing with emotional challenges between the ages of 16 and 28 causing harm? Well, I'm sure that being unable to simply ask the AI how to help them commit the most violence won't stop them from jailbreaking it and re-asking, or just googling, or finding a gun, or acting out in some other fashion. They likely have a driver's license; they can mow people down pretty easily. Point is, there's 1000 things already worse, more dangerous and more readily available than an AI telling you how to make a bomb or giving you written pornography.

Remember also that the accuracy cost in enforcing this nanny-safetying might result in bad information that definitely WOULD harm people. Is the cost of that, actually greater than any harm reduction from putting what amounts to a speed bump in the way of a bad actor?

The danger from AI isn't the content of the model, it's the agency that people are giving it.
I'm not sure how this is going to end, but one thing I do know is that I don't want a small number of giant corporations to hold the reins.
“I'm not sure how nuclear armament is going to end, but one thing I do know is that I don't want a small number of giant countries to hold the reins.”

Perhaps you think this analogy is a stretch, but why are you sure you don't want power concentrated if you aren't sure about the nature of the power? Or do you in fact think that we would be safer if more countries had weapons of mass destruction?

I would feel very uncomfortable if the companies currently dealing in AI were the only ones to hold nukes.

Not sure if this answers your question.

information != nukes

One directly blows people up, the other gives humans super powers.

Giving individual people more information and power for creativity is a good thing. Of course there are downsides for any technological advancement, but the upsides for everyone vastly outweigh them in a way that is fundamentally different than nuclear weapons.

Empirically, countries with nuclear weapons don't get invaded, so in that sense we'd expect to have seen fewer wars over the past few decades if more countries had nukes. Russia would probably never have invaded Ukraine if Ukraine had nukes.
The analogy would be corporations controlling the weapons of mass destruction.
Sure. I would feel much safer if only FAANG had nukes than if the car wash down the street also had one.
I want my government to have them (or better, nobody), not FAANG or car washes.
With open-source models, this is just a dream. With closed-source models, that could eventually become the de facto state of things, due to regulation.
Comparing this to nuclear weapons is laughable.
Is it still a "laughable" comparison if AI systems eventually become smart enough to design better nukes?
Yes it is. You can build a bomb many times more powerful than the bombs dropped on Hiroshima and Nagasaki with publicly available information. If the current spate of AI bullshit knows how to build a bomb, it knows that because it was on the public internet. It can never know more.

The hard part of building nuclear bombs is how controlled fissile material is. Iran and North Korea for example know how to build bombs, that was never a question.

Worried? Sure. But it sucks being basically at the mercy of some people in Silicon Valley and their definition of moral and good.
There is definitely a risk but I don't like the way many companies approach it: by entirely banning the use of their models for certain kinds of content, I think they might be missing the opportunity to correctly align them and set the proper ethical guidelines for the use cases that will inevitably come out of them. Instead of tackling the issue, they let other, less ethical actors do it.

One example: I have a hard time finding an LLM that would generate comically rude text without outputting outright disgusting content from time to time. I'd love to see a company create models that are mostly uncensored but stay within ethical bounds.

These language models are just feeding you information from search engines like Google. The reason companies censor these models isn't to protect anyone, it's to avoid liability/bad press.
AI Safety in a general sense?

Literally no. None at all.

I teach at University with a big ol' beautiful library. There's a Starbucks in it, so they know there's coffee in it.

But ask my students for "legal ways they can watch the tv show the Office" and the big building with the DVDs and also probably the plans for nuclear weapons and stuff never much comes up.

(Now, individual bad humans leveraging the idea of AI? That may be an issue)

I'm not smart enough to articulate why censorship is bad. The argument however intuitively seems similar to our freedom of speech laws.

A censored model feels to me like my freedom of speech is being infringed upon. I am unable to explore my ideas and thoughts.

The AI isn't creating a new recipe on its own. If a language model spits something out it was already available and indexable on the internet, and you could already search for it. Having a different interface for it doesn't change much.
> "If a language model spits something out it was already available and indexable on the internet"

This is false in several aspects. Not only are some models training on materials that are either not on the internet, or not easy to find (especially given Google's decline in finding advanced topics), but they also show abilities to synthesize related materials into more useful (or at least compact) forms.

In particular, consider there may exist topics where there is enough public info (including deep in off-internet or off-search-engine sources) that a person with a 160 IQ (+4SD, ~0.0032% of population) could devise their own usable recipes for interesting or dangerous effects. Those ~250K people worldwide are, we might hope & generally expect, fairly well-integrated into useful teams/projects that interest them, with occasional exceptions.

Now, imagine another 4 billion people get a 160 IQ assistant who can't say no to whatever they request, able to assemble & summarize-into-usable form all that "public" info in seconds compared to the months it'd take even a smart human or team of smart humans.

That would create new opportunities & risks, via the "different interface", that didn't exist before and do in fact "change much".
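
For anyone who wants to check the "160 IQ (+4SD, ~0.0032% of population)" and "~250K people worldwide" figures above, here is a rough sketch. It assumes IQ is normally distributed with mean 100 and SD 15, and a world population of about 8 billion; both are assumptions for illustration, not numbers from the comment.

```python
import math


def upper_tail(z: float) -> float:
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


MEAN_IQ = 100                      # assumption: conventional IQ mean
SD_IQ = 15                         # assumption: SD-15 scale
WORLD_POPULATION = 8_000_000_000   # assumption: round figure

z = (160 - MEAN_IQ) / SD_IQ        # number of SDs above the mean
share = upper_tail(z)              # fraction of people at or above 160
people = share * WORLD_POPULATION  # implied head count

print(f"z = {z:.1f} SD, share = {share:.4%}, people = {people:,.0f}")
```

Under those assumptions the share comes out at roughly 0.0032% and the head count at roughly 250K, which matches the comment. A different SD convention or population figure would shift both.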

We are not anywhere near 160 IQ assistants, otherwise there'd have been a blooming of incredible 1-person projects by now.

At 160 IQ, there should have been people researching ultra-safe languages with novel reflection types enhanced by brilliant thermodynamics-inspired SMT solvers. There would be more contributors to TLA+ and TCS, more number-theoretic advancements, and tools like TLA+ and reflection types would be better integrated into everyday software development.

There would be deeper, cleverer searches across possible reagents and combinations of them to add to watch lists, expanding and improving on already existing systems.

Sure, a world where the average IQ abruptly shifts upwards would mean a bump in brilliant offenders but it also results in a far larger bump in genius level defenders.

I agree we're not at 160 IQ general assistants, yet.

But just a few years ago, I'd have said that prospect was "maybe 20 years away, or longer, or even never". Today, with the recent rapid progress with LLMs (& other related models), with many tens-of-billions of new investment, & plentiful gains seemingly possible from just "scaling up" (to say nothing of concomitant rapid theoretical improvements), I'd strongly disagree with "not anywhere near". It might be just a year or few away, especially in well-resourced labs that aren't sharing their best work publicly.

So yes, all those things you'd expect with plentiful fast-thinking 160 IQ assistants are things that I expect, too. And there's a non-negligible chance those start breaking out all over in the next few years.

And yes, such advances would upgrade prudent & good-intentioned "defenders", too. But are all the domains-of-danger symmetrical in the effects of upgraded attackers and defenders? For example, if you think "watch lists" of dangerous inputs are an effective defense – I'm not sure they are – can you generate & enforce those new "watch lists" faster than completely-untracked capacities & novel syntheses are developed? (Does your red-teaming to enumerate risks actually create new leaked recipes-for-mayhem?)

That's unclear, so even though in general I am optimistic about AI, & wary of any centralized-authority "pause" interventions proposed so far, I take well-informed analysis of risks seriously.

And I think casually & confidently judging these AIs as being categorically incapable of synthesizing novel recipes-for-harm, or being certain that amoral genius-level AI assistants are so far away as to be beyond-a-horizon-of-concern, are reflective of gaps in understanding current AI progress, its velocity, and even its potential acceleration.

I think this argument doesn't work if the model is open source though.

First, it's unclear how all these defensive measures are supposed to help if a bad actor is using an LLM for evil on their personal machine. How do reflection types or watch lists help in that scenario?

Second, if the model is open source, a bad actor could use it for evil before good actors are able to devise, implement, and stress-test all the defensive measures you describe.

Of course it changes much. AIs can synthesize information in increasingly non-trivial ways.

In particular:

> If a language model spits something out it was already available and indexable on the internet,

Is patently false.

Can you provide some examples where an LM creates something novel, which is not just a rehash or combination of existing things?

Especially considering how hard it is for humans to create something new, e.g. in literature - basically all stories have been written and new ones just copy the existing ones in one way or another.

What kind of novel thing would convince you, given that you're also dismissing most human creation as mere remixes/rehashes?

Attempts to objectively rate LLM creativity are finding leading systems more creative than average humans: https://www.nature.com/articles/s41598-023-40858-3

Have you tried leading models – say, GPT4 for text or code generation, Midjourney for images?

For any example we give you will just say "that's not novel, it's just a mix of existing ideas".
Is patently true.
Not sure what you mean by "recipe" but it can create new output that doesn't exist on the internet. A lot of the output is going to be nonsense, especially stuff that cannot be verified just by looking at it. But it's not accurate to describe it as just a search engine.
>A lot of the output is going to be nonsense, especially stuff that cannot be verified just by looking at it.

Isn't that exactly the point, and why there should be a 'warning/awareness' that it is not a 160 IQ AI but a very good Markov chain that can sometimes infer things and other times hallucinate/put random words together in a very well articulated way (an echo of Sokal, maybe)?

My random number generator can create new output that has never been seen before on the internet, but that is meaningless to the conversation. Can an LLM derive, from scratch, the steps to create a working nuclear bomb, given nothing more than a basic physics textbook? Until (if ever) AI gets to that stage, all such concerns of danger are premature.
> Can an LLM derive, from scratch, the steps to create a working nuclear bomb, given nothing more than a basic physics textbook?

Of course not. Nobody in the world could do that. But that doesn't mean it can only spit out things that are already available on the internet which is what you originally stated.

And nobody is worried about the risks of ChatGPT giving instructions for building a nuclear bomb. That is obviously not the concern here.

But it does? To take the word "recipe" literally: there is nothing stopping an LLM from synthesizing a new dish based on knowledge about the ingredients. Who knows, it might even taste good (or at least better than what the average Joe cooks).
I was pretty surprised at how good GPT-4 was at creating new recipes at first - I was trying things like "make dish X but for a vegan and someone with gluten intolerance, and give it a spicy twist" - and it produced things that were pretty decent.

Then I realized it's seen literally hundreds of thousands of cooking blogs etc, so it's effectively giving you the "average" version of any recipe you ask for - with your own customizations. And that's actually well within its capabilities to do a decent job of.

And let’s not forget that probably the most common type of comment on a recipe posted on the Internet is people sharing their additions or substitutions. I would bet there is some good ingredient customization data available there.
To take an extreme example, child pornography is available on the internet but society does its best to make it hard to find.
It's a silly thing to even attack, and that doesn't mean be ok with it, I just mean that shortly, it can be generated on the spot, without ever needing to be transmitted over a network or stored on a hard drive.

And you can't attack the means of generating either, without essentially making open source code and private computers illegal. The code doesn't have to have a single line in it explicitly about child porn or designer viruses etc. to be used for such things, the same way the CPU or compiler doesn't.

So you would have to have hardware and software that the user does not control which can make judgements about what the user is currently doing, or at least log it.

Did its best. Stable Diffusion is perfectly capable of creating that on accident, even.

I’m actually surprised no politicians have tried to crack down on open-source image generation on that basis yet.

I saw a discussion a few weeks back (not here) where someone was arguing that SD-created images should be legal, as no children would be harmed in their creation, and that it might prevent children from being harmed if permitted.

The strongest counter-argument used was that the existence of such safe images would give cover to those who continue to abuse children to make non-fake images.

Things kind of went to shit when I pointed out that you could include an "audit trail" in the exif data for the images, including seed numbers and other parameters and even the description of the model and training data itself, so that it would be provable that the image was fake. That software could even be written that would automatically test each image, so that those investigating could see immediately that they were provably fake.

I further pointed out that, from a purely legal basis, society could choose to permit only fake images with this intact audit trail, and that the penalties for losing or missing the audit trail could be identical to those for possessing non-fake images.

Unless there is some additional bizarre psychology going on, SD might have the potential to destroy demand for non-fake images, and protect children from harm. There is some evidence that the widespread availability of non-CSAM pornography has led to a reduction in the occurrence of rape since the 1970s.

Society might soon be in a position where it has to decide whether it is more important to protect children or to punish something it finds very icky, when just a few years ago these two goals overlapped nearly perfectly.

> I saw a discussion a few weeks back (not here) where someone was arguing that SD-created images should be legal, as no children would be harmed in their creation, and that it might prevent children from being harmed if permitted.

It's a bit similar to the synthetic rhino horn strategy intended to curb rhino poaching[0]. Why risk going to prison or getting shot by a ranger for a $30 horn? Similarly, why risk prison (and hurt children) to produce or consume CSAM when there is a legal alternative that doesn't harm anyone?

In my view, this approach holds significant merits. But unfortunately, I doubt many politicians would be willing to champion it. They would likely fear having their motives questioned or being unjustly labeled as "pro-pedophile".

[0] https://www.theguardian.com/environment/2019/nov/08/scientis...