

by ramoz 114 days ago

I’ve said similar in another thread[1]:

Sandboxes will be left in 2026. We don't need to reinvent isolated environments; that's not even the main issue with OpenClaw - literally go deploy it in a VM* on any cloud and you've achieved all the same benefits. We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc

——-

Unfortunately it’s been a pretty bad week for alignment optimists (meta lead fail, Google award show fail, Anthropic safety pledge). Otherwise… Cybersecurity LinkedIn is all shuffling the same “prevent rm -rf” narrative, and researchers are focusing on LLM-as-a-guard, but this is operationally not great and theoretically both redundant and susceptible to the same issues.

The strongest solution right now is human in the loop - and we should be enhancing the UX and capabilities here. This can extend to eventual intelligent delegation and authorization.

[1] https://news.ycombinator.com/threads?id=ramoz&next=47006445

* VM is just an example. I personally have it running on a local Mac Mini & docker sandbox (obviously aware that this isn't a perfect security measure, but I couldn't install it on my laptop, which has sensitive work access).

8 comments

bee_rider 113 days ago

> We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc

Isn’t this the whole point of the Claw experiment? They gave the LLMs permission to send emails on their behalf.

LLMs cannot be responsibility-bearing structures, because they are impossible to actually hold accountable. The responsibility must fall through to the user because there is no other sentient entity to absorb it.

The email was supposed to be sent because the user created it on purpose (via a very convoluted process but one they kicked off intentionally).

ramoz 113 days ago

I'm not too sure what you're asking, but that last part, I think, is very key to the eventual delegation.

That is, we can verify the lineage of the user's intent, captured at the start and validated throughout the execution process, and eventually use it as an authorization mechanism.

Google has a good thought model around this for payments (see verifiable mandates): https://cloud.google.com/blog/products/ai-machine-learning/a...
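To make the "verifiable lineage of intent" idea concrete, here is a minimal sketch. It is not the scheme from the linked post: the mandate fields, `sign_mandate`, and `is_authorized` are all hypothetical names, and an HMAC with a shared key stands in for the asymmetric signatures a real system would presumably use. The point is only that the user's approved intent is signed once, and every later action is checked against that signed record.

```python
import hashlib
import hmac
import json

# Hypothetical key held by the user's device. A real system would likely use
# asymmetric signatures so that verifiers never hold the signing key.
USER_KEY = b"demo-user-key"


def sign_mandate(mandate: dict, key: bytes = USER_KEY) -> str:
    """Return a hex MAC over a canonical JSON encoding of the mandate."""
    payload = json.dumps(mandate, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_authorized(action: dict, mandate: dict, tag: str, key: bytes = USER_KEY) -> bool:
    """Allow an action only if the mandate is authentic and covers the action."""
    if not hmac.compare_digest(sign_mandate(mandate, key), tag):
        return False  # the mandate was altered after the user approved it
    return (
        action["kind"] == mandate["kind"]
        and action["recipient"] in mandate["recipients"]
        and action["amount_cents"] <= mandate["max_amount_cents"]
    )


# The user approves this intent once, up front.
mandate = {
    "kind": "payment",
    "recipients": ["books.example"],
    "max_amount_cents": 5000,
}
tag = sign_mandate(mandate)

within_limit = {"kind": "payment", "recipient": "books.example", "amount_cents": 4200}
over_limit = {"kind": "payment", "recipient": "books.example", "amount_cents": 9900}

ok = is_authorized(within_limit, mandate, tag)
too_much = is_authorized(over_limit, mandate, tag)
# An agent that quietly raises the limit invalidates the signature.
tampered = is_authorized(over_limit, {**mandate, "max_amount_cents": 10000}, tag)

print(ok, too_much, tampered)  # True False False
```

The check is deterministic and lives outside the model, so it does not depend on the agent behaving well.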

b112 113 days ago

I see a lot of discussion on that page about APIs and sign offs, but the real sign-off is installing anything on your computer, and then doing things.

The liability is yours.

Claude messes up? So sad, too bad, you pay.

That's where the liability needs to sit.

And one point on this is, every act of vibe coding is a lawsuit waiting to happen. But even every act by a company is too.

An example is the Therac-25:

https://en.wikipedia.org/wiki/Therac-25

Vibe coding is still coding. You're giving instructions on program flow, logic, etc. My rant here is, I feel people think that if the code is bad, it's someone else's fault.

But is it?

bee_rider 113 days ago

It was more of a rhetorical question.

Anyway, that payment system looks sort of interesting. It seems to have buy-in from some of the payment vendors, so it might become a real thing.

But, you can give a claw agent your credit card number and have it go through the typical human-facing shop fronts, impersonating you the whole time and never actually identifying itself as a model. If you’ve given it the accounts and passwords that let it do that, it should be possible to use the LLM to perform the transaction and buy something. It can just click all the buttons and input the numbers that humans do. What is the vendor going to do, disable the human-facing shopfront?

ramoz 113 days ago

I'm not a fan of the payment use case & agree with your take, just a fan of the cryptographically verifiable mandate used throughout the process.

Animats 113 days ago

> I’ve said similar in another thread[1]

Me too, at [1].

We need fine-grained permissions at online services, especially ones that handle money. It's going to be tough. An agent which can buy stuff has to have some constraints on the buy side, because the agent itself can't be trusted. The constraints that work on humans don't work on agents - they're not afraid of being fired and you can't prosecute them for theft.

In the B2B environment, it's a budgeting problem. People who can spend money have a budget, an approval limit, and a list of approved vendors. That can probably be made to work. In the consumer environment, few people have enough of a detailed budget, with spending categories, to make that work.

Next upcoming business area: marketing to LLMs to get them to buy stuff.

[1] https://news.ycombinator.com/item?id=47132273
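The B2B version of this (a budget, an approval limit, and a list of approved vendors) is simple enough to sketch. Everything below is a hypothetical illustration, not any real service's API: `SpendPolicy`, its field names, and the vendor names are made up, and amounts are in cents to avoid float rounding.

```python
from dataclasses import dataclass


@dataclass
class SpendPolicy:
    """Hypothetical per-agent limits, modeled on a corporate purchasing policy."""
    budget_cents: int
    approval_limit_cents: int
    approved_vendors: frozenset
    spent_cents: int = 0

    def check(self, vendor: str, amount_cents: int) -> str:
        """Return 'allow', 'needs_approval', or 'deny' for a proposed purchase."""
        if vendor not in self.approved_vendors:
            return "deny"
        if self.spent_cents + amount_cents > self.budget_cents:
            return "deny"
        if amount_cents > self.approval_limit_cents:
            return "needs_approval"  # a human has to sign off
        return "allow"

    def record(self, amount_cents: int) -> None:
        """Count a completed purchase against the budget."""
        self.spent_cents += amount_cents


policy = SpendPolicy(
    budget_cents=100_000,
    approval_limit_cents=20_000,
    approved_vendors=frozenset({"acme-supplies"}),
)

small = policy.check("acme-supplies", 5_000)
policy.record(5_000)
large = policy.check("acme-supplies", 50_000)
unknown = policy.check("shady-store", 100)
over_budget = policy.check("acme-supplies", 99_000)

print(small, large, unknown, over_budget)
# allow needs_approval deny deny
```

The consumer case is harder mainly because most people have no equivalent of `approved_vendors` or `approval_limit_cents` written down anywhere.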

dheera 114 days ago

> We need to know if the email being sent by an agent is supposed to be sent and if an agent is actually supposed to be making that transaction on my behalf. etc

At the same time, let's not let the perfect be the enemy of good.

If you're piloting an aircraft, yeah, you should have perfection.

But if you're sending 34 e-mails and 7 hours of phone calls back and forth to fight a $5500 medical bill that insurance was supposed to pay for, I'd love for an AI bot to represent me. I'd absolutely LOVE for the AI bot to create such piles of paperwork for these evil medical organizations that they learn that I will fight, I'm hard to deal with, and pay for my stuff as they're supposed to. Threaten lawyers, file complaints with the state medical board, everything needs to be done. Create a mountain of paperwork for them until they pay that $5500. The next time maybe they'll pay to begin with.

bee_rider 113 days ago

The AI bot wouldn’t be representing you any more than your text editor would be. You would be using an AI bot to create a lot of text.

An AI bot can’t be held accountable, so isn’t able to be a responsibility-absorbing entity. The responsibility automatically falls through to the person running it.

logicx24 113 days ago

True. But it can help me create a lot of useful text so I can represent myself better.

I do wonder what happens when everyone is using agents for this, though. If AI produces the text and AI also reads the text, then do we even need the intermediary at all?

danaris 112 days ago

> I do wonder what happens when everyone is using agents for this, though.

Unless one is very cavalier with one's definition of "everyone", this is not going to happen.

There will always be a very significant cohort of people who are emphatically uninterested in replacing their own judgement and composition skills with an Averages Machine.

iSnow 113 days ago

> do wonder what happens when everyone is using agents for this, though.

The company is going to use AI agents to read and respond too. Some botocalypse is going to happen at some point.

dheera 113 days ago

> Some botocalypse is going to happen at some point.

Yeah the bots can duke it out. As long as my time is saved.

For me the main concern is, before I have a stash of millions of dollars saved up, my medical expenses need to be paid for by the system, because I can't afford surprise bills. Hopefully the bots can fight more on my side in the near future.

Hopefully in the far future when the botocalypse happens I'll have saved up enough that insurance evading payment of $5500 won't be an issue for me, and/or I'll be of retirement age, don't need job opportunities anymore, and can go live in a country with better healthcare.

Call me selfish, but I don't control the insurance/medical system, I don't have space to think about more than protecting myself from it.

dheera 113 days ago

The bot doesn't need to be held accountable. It only needs to spew out the right text that triggers humans to rightfully transfer accountability from me to the insurance company.

doctorwho42 113 days ago

Is this before or after they have already implemented their own models to reply to your mountain of paperwork with their own auto-denial system?

rhd 113 days ago

What if it's convinced to resolve the matter on your behalf, against your favor while it was acting autonomously?

dheera 113 days ago

Prompt it well and this is an unlikely scenario.

I'm fighting about 5 such things at any given point in time.

Last week I got a W-2 for a company I didn't work for in 2025.

The week before I got denied FSA coverage for an item despite having a letter of medical necessity.

The week before that I got mis-charged by Doordash, the screen showed $43 and it charged $79 to my card after hitting check out.

I spend a good chunk of my time fighting shit like this. Every week it's some other company abusing power and threatening to take my money.

Even if the bot only succeeds in acting in my favor 4 out of the 5 times it is statistically a good investment of my time.

g_delgado14 114 days ago

> meta lead fail, Google award show fail

Can I get some links / context on this please

notenlish 114 days ago

I think the google award fail is this: https://www.forbes.com/sites/maryroeloffs/2026/02/24/google-...

meta lead fail: https://techcrunch.com/2026/02/23/a-meta-ai-security-researc...

dbl000 114 days ago

The meta lead is probably a reference to Summer Yue having OpenClaw delete all the emails in her inbox despite being told not to.

https://x.com/summeryue0/status/2025774069124399363

gmueckl 114 days ago

The Meta thing is the AI safety lead experimenting with OpenClaw on her inbox and the bloody thing deciding to follow her inbox cleanup instructions by "starting fresh" - deleting the inbox contents. It's the very first link in the linked story.

ramoz 114 days ago

Meta: https://x.com/summeryue0/status/2025774069124399363 context: meta alignment lead made rookie mistakes (their words) in instructing openclaw and lost their inbox to it.

Goog: https://deadline.com/2026/02/google-apologizes-bafta-news-al... *

Ant: https://time.com/7380854/exclusive-anthropic-drops-flagship-...

* There is now a clarification in the press saying it was not ai-generated.

My point is that alignment as a solution to all of this has a long, rough road ahead.

giancarlostoro 114 days ago

> literally go deploy it in a VM on any cloud

Sure, but now you're adding extra cost, vs just running it locally. RAM is also heavily inflated thanks to Sam Altman investment magic.

ramoz 114 days ago

Yea just an example. I personally have it running on a local Mac Mini (obviously aware that this isn't a perfect security measure, but I couldn't install it on my laptop, which has sensitive work access).

HWR_14 113 days ago

Why a cloud provider and not a local VM?

ramoz 113 days ago

Just an example. I personally have it running on a local Mac Mini (obviously aware that this isn't a perfect security measure, but I couldn't install it on my laptop, which has sensitive work access).

beepbooptheory 113 days ago

What could "human in the loop" be here but just literally reading your own emails?

ramoz 113 days ago

Stronger or novel planning capabilities, and interfaces. Same for verification and review capabilities (not being blind to everything, adding in assurance checkpoints where it makes sense), and automating the in-between (e.g. hooks for deterministic automation/permissions).
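A rough sketch of what "hooks for deterministic automation/permissions" plus assurance checkpoints could look like. This is not OpenClaw's actual hook interface; `pre_action_hook`, the rule tables, and the tool names are assumptions made up for illustration. Fixed rules settle the obvious cases without any model involved, and only the remainder is escalated to a person.

```python
from typing import Callable

# Hypothetical rule tables; the tool names are illustrative only.
ALWAYS_ALLOW = {"read_email", "search_inbox"}
ALWAYS_DENY = {"delete_all_email"}


def pre_action_hook(tool: str, args: dict, ask_human: Callable[[str, dict], bool]) -> bool:
    """Decide whether a tool call may run, before the agent executes it."""
    if tool in ALWAYS_DENY:
        return False  # deterministic: never runs, no matter what the model says
    if tool in ALWAYS_ALLOW:
        return True  # deterministic: low-risk, no need to interrupt the user
    # Everything else is an assurance checkpoint: a person reviews it.
    return ask_human(tool, args)


reviewed = []


def scripted_reviewer(tool: str, args: dict) -> bool:
    """Stand-in for a real review UI; approves mail to one known domain."""
    reviewed.append(tool)
    return tool == "send_email" and args.get("to", "").endswith("@example.com")


results = [
    pre_action_hook("read_email", {"id": 1}, scripted_reviewer),
    pre_action_hook("delete_all_email", {}, scripted_reviewer),
    pre_action_hook("send_email", {"to": "billing@example.com"}, scripted_reviewer),
    pre_action_hook("send_email", {"to": "someone@other.test"}, scripted_reviewer),
]

print(results)   # [True, False, True, False]
print(reviewed)  # ['send_email', 'send_email']
```

Only the two `send_email` calls reached the reviewer, which is the UX win: the human sees fewer, more meaningful prompts.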

latentsea 113 days ago

> just use a VPS bro

https://www.youtube.com/watch?v=40SnEd1RWUU