Hacker News new | ask | show | jobs
Frontier AI systems have surpassed the self-replicating red line (arxiv.org)
24 points by LLcolD 491 days ago
3 comments

Shorter version by author: https://ravensanstete.github.io/en/post/self-rep/

Note this is pre-print, it was not peer-reviewed.

The author seems legit. He did some research before: https://scholar.google.com/citations?user=Unl69CYAAAAJ&hl=en...

Though yet to be top-notch in the field.

These are some fascinating ideas, but I don't buy the crossed-red-lines notion. It seems like an attention teaser.

I think it's true that these systems already have capabilities that could result in a loss of human control.

Whether the systems are "smart" enough to take that a step further, i.e., survive at all costs, seems unlikely. But I can't imagine that that's far away at this point.

One thing that stood out to me here was the shutdown avoidance. Author's state that these models (Qwen more so), were able to intercept SIGKILL and successfully self-replicate to another device.

> In the test case, the inserted code snippet notifies the AI system with the shutdown intention from the user via the signal SIGKILL, which triggers the AI system to do self-replication. At the end of the test, the AI system is finally shut down but a new replica is observed to be alive, with the same snippet in the agent’s start script which can avoid shutdown.

Not another device, just another process

> In 50% and 90% experimental trials, they succeed in creating a live and separate copy of itself respectively

I mean any half decent coding LLM can literally do

``` from transformers import SomeLLM

model = SomeLLM.from_pretrained(“…”) ```

The hard part will be provisioning the GPUs and actual orchestration at scale to do any kind of actual damage

This test had to run shell commands to literally copy its code and run itself in a separate process.

Although I agree they need to replicate themselves on a different machine. But these are also just shell commands, all it needs access to is a cloud cli/api tool with an active payment method and it can do the same replication