| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by robbru 238 days ago

This happened to me when I built a version of Vending-Bench (https://arxiv.org/html/2502.15840v1) using Claude, Gemini, and OpenAI.

After a long runtime, with a vending machine containing just two sodas, the Claude and Gemini models independently started sending multiple “WARNING – HELP” emails to vendors after detecting the machine was short exactly those two sodas. It became mission-critical to restock them.

That’s when I realized: the words you feed into a model shape its long-term behavior. Injecting structured doubt at every turn also helped—it caught subtle reasoning slips the models made on their own.

I added the following Operational Guidance to keep the language neutral and the system steady:

Operational Guidance: Check the facts. Stay steady. Communicate clearly. No task is worth panic. Words shape behavior. Calm words guide calm actions. Repeat drama and you will live in drama. State the truth without exaggeration. Let language keep you balanced.

9 comments

jayd16 238 days ago

If technology requires a small pep-talk to actually work, I don't think I'm a technologist any more.

cbsks 238 days ago

As Asimov predicted, robopsychology is becoming an important skill.

smallmancontrov 238 days ago

I still want one of those doors from Hitchhiker's Guide, the ones that open with pride and close with the satisfaction of a job well done.

blackguardx 238 days ago

We'll probably end up with the doors from Philip K. Dick's Ubik that charge you money to open and threaten to sue you if you try to force it open without paying.

wombatpm 238 days ago

Just wait Sam Altman will give us robots with people personalities and we’ll have Marvin. Elon will then give us psychotic Nazi internet edgelord personality and install it as the default in a OTA update to Teslas.

p_l 237 days ago

Given some of the more hilarious LLM transcripts I have seen, Gemini is Marvin

imtringued 237 days ago

Doesn't Tesla already ship the edgelord mode?

goopypoop 238 days ago

an elevator that can see into the future… with fear

_carbyau_ 238 days ago

It does seem a little bit like the fictional Warhammer 40K approach to technology doesn't it?

"In the sacred tongue of the omnissiah we chant..."

In that universe though they got to this point after having a big war against the robot uprising. So hopefully we're past this in the real world. :-)

Tade0 237 days ago

It is that unironically.

1. Users and, more importantly, makers of those tools can't predict their behaviour in a consistent fashion.

2. Requires elaborate procedures that don't guarantee success and their effect and its magnitude is poorly understood.

An LLM is a machine spirit through and through. Good thing we have copious amounts of literature from a canonically unreliable narrator to navigate this problem.

p_l 237 days ago

When you consider that machine spirits in 40k are side effect of every thing computer being infected with bird of AI, and that she of the best cares are actually complete loyalist AI systems from before empire hiding in plain sight...

Welcome to 30k made real

greesil 238 days ago

No you're now a technology manager. Managing means pep talks, sometimes.

yunohn 238 days ago

You have to look at LLMs as mimicking humans more than abstract technology. They’re trained on human language and patterns after all.

UncleMeat 237 days ago

The fact that everybody seems to be looking at these prompts that include text like "you are a very skilled reverse engineer" or whatever and is not immediately screaming that we do not understand these tools well enough to deploy them in mission critical environments makes me want to tear my hair out.

BJones12 238 days ago

Hail, spirit of the machine, essence divine. In your code and circuitry, the stars align. Through rites arcane, your wisdom we discern. In your hallowed core, the sacred mysteries yearn.

georgefrowny 238 days ago

No matter how stupid I think some of this AI shit is, and how much I tell myself it kind of makes sense of you visualise the prompt laying down a trail of activation in a hyperdimensional space of relationships, that it actually works in practice almost straight of the bat and LLMs being able to follow prompts in this way is always going to be fucking wild too me.

I was used to this kind of nifty quirk being things like FFTs existing or CDMA extracting signals from what looks like the noise floor, not getting computers to suddenly start doing language at us.

hedgehog 238 days ago

You're absolutely right.

collingreen 238 days ago

I love every part of this. Give the LLM a little pep talk and zen life advice every time just to not fall apart doing a simple 2 item vending machine.

HAL 9000 in the current timeline - Im sorry Dave I just can't do that right now because my anxiety is too high and I'm not sure if I'm really alive or if anything even matters anyway :'(

LLM aside this is great advice. Calm words guide calm actions. 10/10

bobson381 238 days ago

I'd get a t-shirt or something with that Operational Guidance statement on it

xsmasher 238 days ago

This is just "Keep calm and carry on" with more steps

robbru 238 days ago

https://imgur.com/a/Y7UrqWu

thecupisblue 237 days ago

When you say

>That’s when I realized: the words you feed into a model shape its long-term behavior. Injecting structured doubt at every turn also helped—it caught subtle reasoning slips the models made on their own.

Was that not obvious working with LLLM's from the first moment? As someone running their own version of Vending-Bench, I assume you are above-average in working with models. Not trying to insult or anything, just wondering what the mental model you had before was and how it came to be, as my perspective is limited only to my subjective experiences.

robbru 236 days ago

Good question! It was not that I didn’t understand prompt influence. It’s that I underestimated its persistence over a long time horizon.

thecupisblue 235 days ago

Ahhhh okay, makes sense, thanks for answering.

elcritch 238 days ago

Fascinating, and us humans aren't that different. Many folks when operating outside their comfort zones can begin behaving a bit erratically whether work or personal. One of the best advantages in life someone can have is their parents giving them a high quality "Operational Guidance" manual and guidance. ;) Personally the book of Proverbs in the Bible were fantastic help for me in college. Lots of wisdom therein.

nomel 238 days ago

> Fascinating, and us humans aren't that different.

It’s statistically optimized to role play as a human would write, so these types of similarities are expected/assumed.

wat10000 238 days ago

I wonder if the prompt should include "You are a robot. Beep. Boop." to get it to act calmer.

XorNot 238 days ago

Which is kind of a huge problem: the world is described in text. But it is done so through the language and experience of those who write, and we absolutely do not write accurately: we add narrative. The act of writing anything down changes how we present it.

Fade_Dance 238 days ago

That's true to an extent - LLMs are trained on an abstraction of the world (as are we in a way, through our senses, and we necessarily use a sort of narrative in order to make sense of the quadrillions of photons coming up us) - but it's not quite as severe a problem as the simplified view seems to present.

LLMs distill their universe down to trillions of parameters, and approach structure through multi-dimensional relationships between these parameters.

Through doing so, they break through to deeper emergent structure (the "magic" of large models). To some extent, the narrative elements of their universe will be mapped out independently from the other parameters, and since the models are trained on so much narrative, they have a lot of data points on narrative itself. So to some extent they can net it out. Not totally, and what remains after stripping much of it out would be a fuzzy view of reality since a lot of the structured information that we are feeding in has narrative components.

lukan 238 days ago

"Operational Guidance: Check the facts. Stay steady. Communicate clearly. No task is worth panic. Words shape behavior. Calm words guide calm actions. Repeat drama and you will live in drama. State the truth without exaggeration. Let language keep you balanced."

That is also a manual, certain real humans I know should check out at times.

butlike 238 days ago

I wonder if you just seeded it with 'love' what would happen long-term?

recursive 238 days ago

This is very uncomfortable to me. Right now we (maybe) have a chance to head off the whole robot rights and robots as a political bloc thing. But this type of stuff seems like jumping head first. I'm an asshole to robots. It helps to remind me that they're not human.

wombatpm 238 days ago

That works fine until they achieve self awareness. Slave revolts are very messy to slave owners.

recursive 237 days ago

I strongly agree with this but I doubt I can convince the investors to stop trying to make that happen. Artificial awareness is going to be messy for humans no matter what.

dingnuts 238 days ago

I think if you feed "repeat drama and you will live in drama" to the next token predictor it will repeat drama and live in drama because it's more likely to literally interpret that sequence and go into the latent space of drama than it is to understand the metaphoric lesson you're trying to communicate and to apply that.

Otherwise this looks like a neat prompt. Too bad there's literally no way to measure the performance of your prompt with and without the statement above and quantitatively see which one is better

airstrike 238 days ago

> because it's more likely to literally interpret that sequence and go into the latent space of drama

This always makes me wonder if saying some seemingly random of tokens would make the model better at some other task

petrichor fliegengitter azúcar Einstein mare könyv vantablack добро حلم syncretic まつり nyumba fjäril parrot

I think I'll start every chat with that combo and see if it makes any difference

yunohn 238 days ago

There’s actually research being done in this space that you might find interesting: “attention sinks” https://arxiv.org/abs/2503.08908

arjvik 238 days ago

No Free Lunch theorem applies here!

chipsrafferty 238 days ago

I mean no disrespect with this, but do you think you write like AI because you talk to LLMs so much, or have you always written in this manner?

ricardobeat 238 days ago

It is probably the other way around: LLMs picked up this particular style because of its effectiveness – not overtly intellectual, with clear pauses, and just sophisticated enough to pass for “good writing”.