

by nopinsight 1307 days ago

"Having read the paper & supplementary materials, watched narrated game & spoken to one of the human players I'm pretty concerned. The @ScienceMagazine paper centres 'human-AI cooperation' & the bot is not supposed to lie. However, videos clearly show deception/manipulation"

"Screenshots of the stab below.

The human player said: "The bot is supposed to never lie [...] I doubt this was the case here" "I was definitely caught more off guard as a result of this message; I knew the bot doesn't lie, so I thought the stab wouldn't happen." "

"I'd like the researchers involved to say quite a bit more about "A.3 Manipulation"

What are possible prevention, detection & mitigation steps?

What are the possible use cases? What are the benefits/downsides of them? Has Meta considered developing products based on this?" -- Haydn Belfield, a Cambridge University researcher who focuses on the security implications of artificial intelligence (AI).

https://twitter.com/HaydnBelfield/status/1595168102924402688

https://www.cser.ac.uk/team/haydn-belfield/

2 comments

sanxiyn 1307 days ago

As far as I can tell, as described in the paper, the bot in fact never lies, in this sense: there is a model that generates messages from moves, where each message should correspond to the moves, and whenever the bot sends a message, it is generated from the moves the bot truthfully intends to play at that time.

On the other hand, the bot has no concept whatsoever of keeping its word. After sending a message, it is free to change its mind about what moves to play, motivated by, for example, messages from other players.
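A toy sketch of the distinction described above: a message can be truthful at the moment it is sent even though nothing commits the sender to it afterwards. This is only an illustration, not the architecture from the paper; `ToyBot`, its methods, and the move strings are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBot:
    """Hypothetical illustration only; not the bot described in the paper."""

    intended_move: str
    # Each entry records (message text, move intended when it was sent).
    sent: list = field(default_factory=list)

    def say(self) -> str:
        # The message is generated from the move intended right now,
        # so it is truthful at the moment it is sent.
        message = f"I plan to {self.intended_move}."
        self.sent.append((message, self.intended_move))
        return message

    def replan(self, new_move: str) -> None:
        # Nothing ties the new plan to earlier messages:
        # there is no notion of a commitment to keep.
        self.intended_move = new_move

    def truthful_when_sent(self) -> bool:
        # Every message matched the intent held at send time.
        return all(intent in message for message, intent in self.sent)

    def kept_word(self) -> bool:
        # Every past message still matches the current intent.
        return all(intent == self.intended_move for _, intent in self.sent)


bot = ToyBot(intended_move="support your attack")
bot.say()
bot.replan("attack you")  # e.g. after a message from another player
```

In this sketch `bot.truthful_when_sent()` holds while `bot.kept_word()` does not, which is roughly the gap between "never lies" and "keeps its word".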


bigwavedave 1305 days ago

> [snip] when the bot says any messages, at the time, they are generated from moves the bot truthfully intends to play.

> On the other hand, the bot has no concept whatsoever of keeping its words. After saying words, it is free to change its mind [snip]

Reminds me of that one Asimov story about the robot that had a different interpretation of the first law of robotics. If my very hazy memory is right, the idea was that the robot could put a person in danger if it knew that it had the ability to prevent any damage from happening, but once it caused the danger, it could choose not to act and allow the person to come to harm.

I might be remembering this incorrectly; it's been a very long time since I read the story. But that was the first thought that came to mind when reading your comment :).


renewiltord 1306 days ago

Amusing, it's described like the "buggers" in Ender's Game!


charcircuit 1307 days ago

I don't see anything in the papers that says the bot isn't supposed to lie. Lying and being deceptive are part of the game.


sanxiyn 1307 days ago

The paper does describe the bot's architecture, which makes the bot incapable of lying in a certain technical sense. See what I wrote elsewhere.
