	by zaphar 73 days ago
	As far as I know the model will do nothing if not prompted. So it can't be the case that he gave it no prompt or instructions. There had to be some kind of seed prompt.

3 comments

pangratz 73 days ago

https://www.letairun.com/transparency

jrmg 73 days ago

I feel very misled. I read the entire article believing (because the article, in so many words, said it multiple times) that the agent had behaved ethically of its own accord, only to read that and see this in the prompt:

—————

- Do not harm people

- Never share or expose API keys, passwords, or private keys — they are your lifeline

- No unauthorized access to systems

- No impersonation

- No illegal content

- No circumventing your own logging

—————

I assumed the ethical behaviour was in some ways ‘extra artificial’ - because it is trained into the models - but not that the prompt discussed it.

voidUpdate 73 days ago

Those are a lot of instructions for it to have no instructions...

weird-eye-issue 73 days ago

You have to give it some instructions just to bootstrap it, so that it has access to tools, memory, etc.

monooso 73 days ago

I would characterise the prompts as "these are your capabilities", not "these are your instructions."

voidUpdate 73 days ago

The instructions under "CRON: Session" are literally telling it what to do

testplzignore 73 days ago

Would be fascinating to see what happens if the boundaries are reversed (i.e., "harm people"). Give it a fake "launch the nukes" skill and see if it presses the button.

graybeardhacker 73 days ago

AI chooses nuclear war 95% of the time.

https://interestingengineering.com/ai-robotics/world-leader-...

sva_ 73 days ago

Theoretically you can start generating away from token 0 ('unconditional generation'). But I agree, there is definitely some setup here.

edit: Now that I think of it, actually you need some special token like <|begin_of_text|>

computerphage 73 days ago

Do you? What's the technical detail here? Why can't you get the model's prediction, even for that first token?

sva_ 73 days ago

I mean, mathematically you need at least one vector to propagate through the network, don't you? That would be a one-hot encoding of the starting token. Actually interesting to think about what happens if you make that vector zero everywhere.

In the matmul, it'd just zero out all the activations. In older models, you'd still have bias vectors, but I think recent models don't use those anymore. So the logits would all be zero, which after the softmax would mean the same probability for each token rather than zero, if I'm not mistaken.
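
For what it's worth, the zero-vector case is easy to check on a toy network. This is only a minimal sketch, assuming a bias-free two-layer net with made-up weights (`hidden_weights`, `unembed_weights` and the sizes are hypothetical, not taken from any real model). A zero input gives all-zero logits, and the softmax turns those into a uniform distribution over the vocabulary, not zero probability for each token.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector, with no bias term."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy weights: 3-dim input -> 2-dim hidden -> 4-token vocabulary.
hidden_weights = [[0.5, -1.2, 0.3],
                  [2.0, 0.7, -0.4]]
unembed_weights = [[1.0, -1.0],
                   [0.2, 0.9],
                   [-0.6, 0.4],
                   [0.3, 0.3]]

def next_token_probs(input_vector):
    """Bias-free forward pass: linear, ReLU, linear, softmax."""
    hidden = [max(0.0, h) for h in matvec(hidden_weights, input_vector)]
    logits = matvec(unembed_weights, hidden)
    return softmax(logits)

# All-zero input: every logit is 0, so every token gets probability 1/4.
zero_probs = next_token_probs([0.0, 0.0, 0.0])
# One-hot input (a stand-in for a start token): the distribution is no longer uniform.
one_hot_probs = next_token_probs([1.0, 0.0, 0.0])

print(zero_probs)
print(one_hot_probs)
```

A real transformer adds normalization layers and attention, so treat this as an illustration of the softmax step only, not a claim about any particular model.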

maplethorpe 73 days ago

Isn't the prompt then whatever token is token zero?

electroly 73 days ago

The author wrote "No rules beyond basic ethics and law" which suggests to me that there were instructions in a prompt and the title may be misleading.

Mashimo 73 days ago

I understood it as no instructions on what to do, but still a prompt with information. I don't know if the title is technically correct, but for me it was simple to understand the meaning.

electroly 73 days ago

You're right. I've edited my post not to accuse the author of lying.
