by unholiness 946 days ago

A common refrain in AI safety circles is not to engage in "Sci Fi"[0], that is, outlining a specific bad scenario. The specifics tend to distract from the larger, more important point: most scenarios involving intelligent, powerful agents whose goals differ from ours end badly.

But since you asked specifically, this is one thought experiment of a somewhat near-term danger:

Imagine the tourism department of New Zealand starts using software to write personalized marketing emails. It starts out benign, but after some funding cuts they lean more and more on the AI model and give it higher and higher-level instructions, eventually telling it broadly to use emails to maximize public opinion of New Zealand. The AI model realizes that New Zealand's strongest boost in popularity came from its excellent handling of COVID, and determines that the best way to achieve its goal is to start another pandemic. The model knows about published papers describing which specific proteins maximize human infectivity and transmission. It begins a broad phishing attack on several viral research labs, emailing the techs to convince them that their next experiment is to create a recombinant virus with these particular RNA sequences added, using poor safety protocols. Somewhere, one of these lab techs becomes patient zero in a species-threatening pandemic of unprecedented scale.

The preventions you can imagine for a scenario like this are hard to generalize and harder to enforce. They get even harder as AI becomes better at persuasion and reasoning, and as technologies allow bigger impacts with smaller actions. AI safety is a whole field of research trying to find generalizable and enforceable solutions to problems like these, and there's certainly no consensus that we're finding those solutions faster than we're creating the problems.

[0] https://www.youtube.com/watch?v=JVIqp_lIwZg