by ilaksh 941 days ago
We don't have a dangerous AI yet, but because we are talking about an existential risk, yes we do want to start thinking about it now.

AI does not have to be alive or truly conscious or anything to be dangerous. It just needs to be very effective at problem solving and have someone take the guardrails off and tell it to act in a self-interested way. Especially without humans in the loop.

It seems likely that the systems will continue to get much faster and more robust in terms of problem solving. It is easy to anticipate the possibility within just a few years of agent swarms being connected to pretty much everything (as directed by their human owners), and no human being able to compete. This will be a precarious position.

2 comments

But it can't act in a self-interested way because of hardware and physical limitations. I mean, what can it do in the worst case? We have systems in place to prevent abuse of critical things, and the resources to power such an AI are limited too (hardware, energy, network connection). It can't even replicate to create self-powered robots like in the Matrix fantasy, because we don't even have that kind of fully automated production. So what risks are we talking about here? IMO, worst case it can mess with the stock market...
Maybe I am just daft, but how is this not a simple privilege escalation issue? Just don't give it admin. What's the problem?
Because in software we've defined an environment where every software method of gaining or escalating privilege is hardened. As a result, the most reliable way to escalate is usually social engineering.

LLMs exist in an environment whose vulnerable surface area is social engineering. How do you lock down a system against all possible social engineering?

Oh yeah, and the "system" isn't any computer system somewhere, it's the entire world itself.

So one of the problems the safetyists are trying to solve is: how do you protect a messy system, the size and complexity of the entire world, from social engineering? The answer is clearly not to use traditional approaches, which consistently fail.

And that's just one of the problems they're trying to solve.

Does my conflation here make any sense or have any applicability, even though the system is far more distributed? I don't think I have a sufficiently robust mental model of any of this, and I just don't know enough to dispute it.
But why does the LLM itself have "access to root", whatever that means as the OS equivalent in whatever larger context these models operate within?
There is a strong likelihood that these agent swarms will be significantly smarter and more effective at not only solving problems, but also operating in sync and spreading information and software than any human or group of humans. Put that in the context of the track record of avoiding privilege escalation we have with human actors, and the idea that these systems are connected to critical infrastructure and military assets.