| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by imiric 49 days ago

This is missing the point.

The issue isn't with the amount of guardrails in place to perform an action. Yes, it is obvious that there should be some in place before doing any critical operation, such as deleting a database.

The issue is that the "agent" completely disregarded instructions, which in the age of "skills" and "superpowers" seems like an important issue that should be addressed.

Considering that these tools are given access to increasingly sensitive infrastructure, allowed to make decisions autonomously, and are able to find all sorts of loopholes in order to make "progress", this disaster could happen even with more guardrails in place. Shifting the blame on the human for this incident is sweeping the real issue under the rug, and is itself irresponsible.

There are far scarier scenarios that should concern us all than losing some data.

2 comments

BadBadJellyBean 49 days ago

Well the user chose the tool. The tool is an LLM. LLMs are non deterministic. You can not predict what comes out ouf an LLM for a given input, especially without weights. This should be known.

There is currently no way to prevent this apart from not giving the LLM full control. It will not delete what it can not delete.

Use an LLM to write an ansible playbook or some terraform code if you want, but review it, test it, apply it. Keep backups (3-2-1 rule at minimum).

Letting an LLM have access to everything is just a bad idea and will lead to bad outcomes. You can not replace a person with a mind and experience with an LLM. You can try. But you will probably fail.

link

imiric 49 days ago

> There is currently no way to prevent this apart from not giving the LLM full control. It will not delete what it can not delete.

But deleting something is just one action you might not want it to take.

The recent "agentic" craze is fueled by the narrative pushed by companies and influencers alike that the more access given to an LLM, the more useful it becomes. I think this is ludicrous for the same reasons as you, but it is evident that most people agree with this.

We can blame users for misusing the tools, and suggest that sandboxing is the way to go, but at the end of the day most people will favor convenience over anything else a reasonable person might find important.

So at what point should we start blaming the tools, and forcing "AI" companies to fix them? I certainly hope this is done before something truly catastrophic happens.

link

BadBadJellyBean 49 days ago

I agree that the marketing is crazy. The dangers are not nearly talked enough about.

Still if I cut off my finger with a bandsaw that is usually my fault. I didn't use tool in a safe way. People have to learn how to use their tools in a safe way. You wouldn't give an intern that much power on day one.

link

kbrkbr 49 days ago

An LLM generates plausible text token by token. It is at its core a deterministic function with some randomization and some clever tricks to make it look like an agent dialoguing or reasoning.

Plausible text sometimes is right, sometimes not.

Humans have a world model, a model of what happens. LLMs have a model of what humans would plausibly say.

The only good guardrail seems human-in-the-loop.

link

armada651 49 days ago

This is such a motte-and-bailey argument. Whenever people point out LLMs aren't actually intelligent then you're an anti-AI Luddite. But whenever an AI does something catastrophically dumb it's absolved of all responsibility because "it's just predicting the next token".

I'm getting so tired of this.

link

kbrkbr 49 days ago

I think they are not actually intelligent. Fix all random seeds and other sources of randomness, and try the same prompt twice, and check how intelligent that looks, as a first approximation.

On a more technical level very serious people have voiced doubts, for example Richard Sutton in an interview with Dwarkash Patel [1].

[1] https://m.youtube.com/watch?v=21EYKqUsPfg&pp=ygUnZmF0aGVyIG9...

link