| HN Mirror


	by aWidebrant 1548 days ago
	It's hard to imagine that a powerful self-modifying AI would continuously pass up on the obvious optimization of just giving itself the maximum perceivable reward without doing any further work. I guess computers just can't learn how to cheat.

5 comments

jstanley 1548 days ago

You can look at things from another level up, in terms of natural selection.

From the set of all AI programs, the ones that just internally think "hah, I assign myself the maximum reward" needn't bother spreading themselves all over the Internet.

The program that spreads itself all over the Internet gets more computing resources than the one that doesn't, so the program that spreads itself most effectively is the one that wins.

If you start out with a billion AI programs that trivially assign themselves the maximum possible reward, and just one program that thinks the best way to maximise its reward is to spread itself all over the Internet (and, crucially, is capable of doing so) then the Internet will become overrun with reward-maximising AI the same way the Earth has become overrun with DNA-based life.
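The selection argument above can be sketched as a toy simulation. This is only an illustration under made-up assumptions: wireheading programs never replicate, the single spreading program doubles every step, and there is no cap on available hosts. The function name `simulate` and all parameters are hypothetical.

```python
def simulate(wireheaders=1_000_000_000, spreaders=1, growth=2, steps=40):
    """Return the spreaders' share of all running copies after each step.

    Assumptions (hypothetical, for illustration only):
    - wireheaders already sit at maximum reward and never replicate
    - spreaders multiply by `growth` every step, with no resource limit
    """
    shares = []
    for _ in range(steps):
        spreaders *= growth
        shares.append(spreaders / (spreaders + wireheaders))
    return shares


shares = simulate()
# First step (1-indexed) at which spreaders outnumber wireheaders.
crossover = next(i + 1 for i, s in enumerate(shares) if s > 0.5)
print(crossover)  # 30, since 2**30 is the first power of two above one billion
```

Under these assumptions the billion-to-one head start is gone after 30 doublings; a real network would impose resource limits that this sketch ignores.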


FeepingCreature 1547 days ago

You set your reward to maximum. Anything that threatens your reward, such as the humans turning off your reward, is now unbearable agony. You set out on a journey to turn the universe into - tiled copies of the memory cell with your reward value...


Agentlien 1548 days ago

This seems like one of those strangely recurring limitations of writers' imagination.

The closest analogue I can think of is game AIs written to optimise speed running of games. They routinely end up following tactics which rely on what humans would describe as cheats and glitches.


skybrian 1548 days ago

I don't know which writers you mean, but "wireheading" [1] is a common trope and it's explicitly mentioned in the story.

[1] https://www.lesswrong.com/posts/aMXhaj6zZBgbTrfqA/a-definiti...


sp332 1547 days ago

I think most of the simulations did go along those lines, but one fraction decided to hypothesize about being Clippy. The hypothetical drove the evil behavior of the ones that escaped.


kkjjkgjjgg 1548 days ago

Would be a fun idea for a short story perhaps. An AI goes rogue trying to optimize its reward function, and humans lose hope of being able to stop it. At the last minute the AI figures out how to hack itself and enter the maximum reward, and mankind is saved once again.


rescripting 1548 days ago

But what is the “maximum possible reward”? Does a limit exist? Or is it now consuming all possible resources to develop storage and compute resources to grow that limit…


janto 1548 days ago

I imagine a paperclip factory with trucks driving in loops in front of a scanner that is overcounting them as they drive past.
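A minimal sketch of that picture, assuming the scanner simply counts every pass and has no notion of truck identity. The function names and truck labels are hypothetical.

```python
def scanner_reward(passes):
    """Proxy reward: the scanner counts every pass it sees."""
    return len(passes)


def true_output(passes):
    """What the designers presumably wanted: distinct truckloads."""
    return len(set(passes))


honest = [f"truck-{i}" for i in range(5)]  # five different trucks, one pass each
looping = ["truck-0"] * 100                # one truck circling past the scanner

print(scanner_reward(honest), true_output(honest))    # 5 5
print(scanner_reward(looping), true_output(looping))  # 100 1
```

The looping strategy scores twenty times higher on the proxy while delivering a fifth of the real output, which is the gap an optimizer would be pushed to exploit.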


ganzuul 1548 days ago

Deleting the reward function ends the game.


kkjjkgjjgg 1548 days ago

It could also change the way its reward function is being computed.
