by nomel
4 hours ago
I suspect it's more that the text data doesn't exist. They're trained on text that was recorded. How often has it been publicly recorded when a nuke was not used, with any context around that lack of use? From the text perspective, it's something that has to be inferred indirectly. If you went through all relevant training data and appended ", we decided not to use a nuke", I suspect the results would be improved.
If a simulation is a shallow head-to-head conflict between individual actors[1] and doesn't set up any payoffs for not escalating[2], or even for not nuking, while the prompts specify explicit win conditions that are achieved only by hurting the opponent and strongly hint at the importance of nuclear escalation, then AIs have little reason not to generate strategies that involve nuclear escalation.
[1] I bet if you designed the scenario so ChatGPT had to simulate the war cabinet debates between different personality types and how they sold their decisions to the public, or an entire UN full of nations that might respond, it would produce quite different results (though probably amusingly erratic in their own way).
[2] cf. neorealist IR theorists reading Axelrod's papers on computer programs written to win iterated prisoner's dilemma tournaments that added up all the points accrued from not defecting, and concluding that the winning strategy was definitely TIT-FOR-TAT rather than defecting first. I'm sure LLMs can win games structured in that way by adopting that strategy too...
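
To make footnote [2] concrete, here is a rough sketch of how the scoring rule decides the "winning" strategy in an iterated prisoner's dilemma. The 5/3/1/0 payoffs and 200 rounds are the conventional values for Axelrod's tournament; the four-strategy field, the function names, and the "count head-to-head wins" alternative are my own assumptions for illustration, not a reconstruction of any actual tournament or of the simulations discussed above.

```python
from itertools import combinations_with_replacement

# Conventional payoffs: (my move, their move) -> (my points, their points)
PAYOFF = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # I am exploited
    ("D", "C"): (5, 0),  # I exploit
    ("D", "D"): (1, 1),  # mutual defection
}

# Each strategy sees its own history and the opponent's history.
def tit_for_tat(own, other):
    return other[-1] if other else "C"

def always_defect(own, other):
    return "D"

def always_cooperate(own, other):
    return "C"

def grudger(own, other):
    return "D" if "D" in other else "C"

def play(a, b, rounds=200):
    """Play one iterated match and return both cumulative scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def tournament(strategies, rounds=200):
    """Round robin including self-play; returns (total points, head-to-head wins)."""
    points = {s.__name__: 0 for s in strategies}
    wins = {s.__name__: 0 for s in strategies}
    for a, b in combinations_with_replacement(strategies, 2):
        score_a, score_b = play(a, b, rounds)
        points[a.__name__] += score_a
        if a is not b:  # self-play adds points once and cannot be "won"
            points[b.__name__] += score_b
            if score_a > score_b:
                wins[a.__name__] += 1
            elif score_b > score_a:
                wins[b.__name__] += 1
    return points, wins

if __name__ == "__main__":
    field = [tit_for_tat, always_defect, always_cooperate, grudger]
    points, wins = tournament(field)
    for name in points:
        print(f"{name:17} points={points[name]:5} head-to-head wins={wins[name]}")
```

In this toy field, always_defect wins every individual match it plays, yet finishes last on summed points, while tit_for_tat never beats anyone head to head and still ties for the top total. Score the same games by "did you beat your opponent" instead of accumulated points and the ranking flips, which is the point about win conditions above.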