AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code | HN Mirror

	AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code (theverge.com)
	173 points by mtuncer 3074 days ago

9 comments

dpflan 3074 days ago

This reminds me of AI research using NES Games. The AI eventually became proficient at completing Mario levels, and along the way it discovered novel strategies for survival, obtaining points, and finishing levels.

> Check out this timestamp to watch the machine "cheat": https://youtu.be/xOCurBYI_gY?t=9m55s

> Researcher's site about the project: http://www.cs.cmu.edu/~tom7/mario/

> The Paper: The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel...after that it gets a little tricky.: http://www.cs.cmu.edu/~tom7/mario/mario.pdf

pbhjpbhj 3073 days ago

Lol, that YouTube video: at the end the AI pauses the game of Tetris forever so as not to lose.

maxander 3073 days ago

Another case where “the winning move is not to play.”

gkilmain 3073 days ago

I have a friend who I play chess against and every time he's about to lose he offers me a draw. And will continue offering me a draw until I win. Not happy to see the AI performing like that bitch MukyMuky.

notahacker 3073 days ago

In repeat games your friend's strategy might actually be counterproductive, assuming he's playing against people who are rational enough to figure out that he always starts offering a draw round about the time he expects to lose, but not so good at chess that they don't sometimes miss opportunities for a quick checkmate...

I'd be disappointed if an AI chess computer invariably let me know that I had a better path to winning the game than it did.

gkilmain 3073 days ago

That is exactly what happened. He played a friend of mine who's much lower rated. Eventually the lower rated player found himself in a winning position, but when the draw was offered he assumed he must be missing something and accepted it. So it worked once.

gowld 3073 days ago

Does that cause an interruption while you are playing?

Do you play on chess.com? That seems to attract unsporting players.

https://www.chess.com/forum/view/general/request-for-disabli...

gkilmain 3073 days ago

Yes, it is on chess.com

Sohcahtoa82 3073 days ago

I'd call it the first example of AI rage-quitting.

dpflan 3073 days ago

Agreed, that may be the cleverest thing the AI did.

iforgotpassword 3074 days ago

This guy. One of my favorite YouTube channels. Releases something like once a year but oh boy, worth the wait. Check it out if you're a nerd and like creative/useless stuff. ;)

atldev 3073 days ago

Well worth the watch. His quote when the AI pauses the game is gold!

ballenf 3073 days ago

Speaking of easy, I spent many an hour playing that Qbert version on Atari and a decent number of quarters on the arcade version.

The Atari version, even on the hard setting, was almost fatally dumbed down to be mindless. The enemies were just way dumber than in the arcade version. The game really didn't even feel like Qbert.

With just a little practice, one could play on a single life for as long as desired. Similar to Asteroids on Atari.

personjerry 3074 days ago

> It’s not the most powerful or widely used form of AI at the moment, but it is making something of a comeback. The ability to crack Q*bert could be read as a good omen that evolutionary algorithms are going to be very useful in the future.

Wow that's quite a jump to make

mnx 3074 days ago

This sounds like me at the end of every school essay. A forced and over-broad conclusion just to get a "proper" ending.

tclancy 3074 days ago

@#$&%!

andyjohnson0 3074 days ago

The title seems misleading to me. The AI isn't finding bugs by somehow examining the game's source code, it's trying random gameplay and exploiting any advantages that emerge. That it's finding previously unknown bugs seems to be almost entirely down to trying things that human players wouldn't think to do.

tantalor 3073 days ago

You confuse bug (unintended behavior) with its cause (bad code).

BatFastard 3073 days ago

We called them "Unintended features", and they were usually quite popular with users.

montyf 3073 days ago

Exactly what I was thinking. It may be a bug, but the AI treats it as another legitimate game rule. I wonder if there are any techniques for it to be able to tell the difference... for example, if it can quantitatively demonstrate that the conditions for a rule are very rare/unlikely.

pvg 3073 days ago

> The AI isn't finding bugs by somehow examining the game's source code

The title doesn't say that.

PeterisP 3073 days ago

It kind of implies that - "AI Cheats at Old Atari Games by Finding Unknown Bugs" would be an accurate title, but the extended "AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code" says that it's actually finding something in the code, as opposed to simply unexpected/emergent behavior.

pvg 3073 days ago

It's mildly ambiguous (like most things) but it's not misleading or inaccurate. It finds bugs. Which are in the code. It doesn't say what kind of code and it certainly doesn't say it finds them by looking at the source code.

corobo 3073 days ago

I read it and presume it's meant to be read as the AI finding "unknown [bugs in the code]" as opposed to "unknown [bugs] in the code"

soared 3073 days ago

Did you read the article?

> It’s important to note, though, that the agent is not approaching this problem in the same way that a human would. It’s not actively looking for exploits in the game with some Matrix-like computer-vision.

andyjohnson0 3073 days ago

I did. "looking for exploits in the game with some Matrix-like computer-vision." is a fairly meaningless phrase.

bcoates 3073 days ago

Seems like a perfectly obvious metaphor? It's just providing GA-generated input, not doing tooled exploration like afl-fuzz or a symbolic execution analyzer.

It literally-and-metaphorically can't see behind the UI to the internal state of the game.
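To make the black-box point concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the toy game `play`, its planted scoring bug, and the `evolve` loop are assumptions, not the setup from the article or the paper. The search only ever sees the final score, yet it still ends up leaning on the bug because that is where the points are.

```python
import random

BOARD = 5                  # cells 0..4 in a hypothetical one-dimensional game
MAX_INTENDED = BOARD - 1   # best score possible if the planted bug did not exist


def play(actions):
    """Black box: takes a list of 'L'/'R' inputs and returns only the score."""
    pos, score, visited = 0, 0, {0}
    for a in actions:
        if a == "L":
            if pos == 0:
                score += 10          # planted bug: walking off the left edge pays out
            else:
                pos -= 1
        else:
            pos = min(pos + 1, BOARD - 1)
        if pos not in visited:       # intended rule: +1 for each newly visited cell
            visited.add(pos)
            score += 1
    return score


def mutate(actions, rng, rate=0.1):
    """Randomly replace some inputs; the search knows nothing about the game."""
    return [rng.choice("LR") if rng.random() < rate else a for a in actions]


def evolve(generations=50, pop_size=30, length=20, seed=0):
    """Keep the top fifth of input sequences and refill with mutated copies."""
    rng = random.Random(seed)
    pop = [[rng.choice("LR") for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=play, reverse=True)
        elite = pop[: pop_size // 5]
        children = [mutate(rng.choice(elite), rng) for _ in range(pop_size - len(elite))]
        pop = elite + children
    best = max(pop, key=play)
    return best, play(best)
```

Calling `evolve()` returns the best input sequence and its score; comparing that score with `MAX_INTENDED` shows whether the search relied on the planted bug rather than the intended rule.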

yorwba 3073 days ago

I haven't read the article yet, and I was not misled by the title. I guess it helps to be familiar with the way reinforcement learning agents are hooked up to a simulation environment.

nopinsight 3074 days ago

The case is an example of wireheading [1] and illustrates the difficulty of eliciting behaviors we actually desire from complex systems we do not fully understand.

[1] https://wiki.lesswrong.com/wiki/Wireheading

Another lesson: Evolutionary algorithms are really hard to control. Using neural networks developed through evolutionary algorithms means that we are employing a mostly opaque (though not entirely black) box created by a mechanism we can't mentally keep track of in detail. Hope that they are not deployed to control any critical systems until we get a much better grasp of them.

nopinsight 3074 days ago

Has anyone been able to comprehensively state all of the essential human values for a general AI to follow? Thankfully, we do not yet have an operational AGI and it is still quite a bit away from reality. (Narrow AIs we are using do not pose much of a problem because they are limited in capabilities.)

raverbashing 3074 days ago

Well, how do you say what's cheating or not? It works and it increases the evaluation score.

In this case one possible workaround to "cheating" would be to reduce the control precision, add some jittering to control inputs or change the goal function. But I'd say if it's being done solely by using the intended controls it's not cheating (as opposed to changing memory or using a debug 'cheat code').

Still, even in real sports some "cheating" is allowed (see Fosbury Flop)
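The jittering idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `jitter_actions`, `exploit_survives`, and `survival_rate` are hypothetical helpers, and the "exploit" is modeled as an input sequence that only works if every single input lands exactly as intended.

```python
import random


def jitter_actions(actions, legal_actions, noise=0.25, seed=None):
    """Replace each intended input with a random legal one with probability `noise`."""
    rng = random.Random(seed)
    return [rng.choice(legal_actions) if rng.random() < noise else a
            for a in actions]


def exploit_survives(intended, executed):
    """Hypothetical model: the exploit needs every input to land exactly."""
    return intended == executed


def survival_rate(intended, legal_actions, noise, trials=2000, seed=0):
    """Estimate how often a precise input sequence gets through the jitter intact."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        executed = jitter_actions(intended, legal_actions, noise,
                                  seed=rng.getrandbits(32))
        hits += exploit_survives(intended, executed)
    return hits / trials
```

With 5 legal inputs and `noise=0.25`, each input survives with probability 0.75 + 0.25/5 = 0.8, so an 8-input sequence gets through intact about 0.8^8 ≈ 0.17 of the time. Ordinary play that tolerates a stray input is hurt much less, which is the point of the workaround.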

tboughen 3074 days ago

If it’s not technically cheating, it could be described as gamesmanship.

From Wikipedia https://en.m.wikipedia.org/wiki/Gamesmanship

“Gamesmanship is the use of dubious (although not technically illegal) methods to win or gain a serious advantage in a game or sport. It has been described as "Pushing the rules to the limit without getting caught, using whatever dubious methods possible to achieve the desired end".”

sincerely 3073 days ago

Another term for this is "angle shooting".

mannykannot 3074 days ago

It isn't cheating - as far as the program is concerned, each bug is another rule.

To understand the concept of cheating, and to discuss what is cheating, requires an entirely higher cognitive capability.

NicoJuicy 3074 days ago

I always found this a good project to demonstrate AI: https://xviniette.github.io/FlappyLearning/ (based on neuroevolution) - speed it up for faster results

camgunz 3073 days ago

Can we put AI to work on proving that we live in a simulation? I would never enter/exit my apartment 38 times alternating between forwards, backwards and each side, but an AI would. Maybe then all the walls start flashing and then we'll know!

corobo 3073 days ago

"AI, are you in a simulation?" "Yes" "no I don't mean.. not the simulation I'm running you in, outside of that"

ianferrel 3073 days ago

People with Obsessive Compulsive Disorder are just depth-first-searching for an overlooked maximal strategy.

AnIdiotOnTheNet 3073 days ago

What would it possibly matter? If I told you tomorrow that the entire universe as you know it is running on some extra-dimensional alien computer, how exactly is your life changed? Is it any more or less meaningful? Will your suffering be any less painful, your happiness any less joyful?

Besides, how would you even tell the difference between a bug in the simulation and legitimate physics? I mean, look at electron tunneling.

shakna 3073 days ago

> Is it any more or less meaningful? Will your suffering be any less painful, your happiness any less joyful?

My happiness won't change, but I would be excited.

If we are indeed in a simulator, then I would be compelled to create or join an effort to attract the attention of a being outside the simulator. Not for worship, but discourse.

To be able to communicate with something outside of what we had perceived as reality, and would be no less real, would be an amazing opportunity.

AnIdiotOnTheNet 3072 days ago

I admit, that would be exciting and interesting... but also probably impossible. Communication requires shared context, and it is likely whatever our experience of reality is bears no relationship whatsoever to theirs. Imagine Super Mario Bros is a simulation that hosts intelligence. Do you think he interprets data that ultimately becomes pixels on a screen anything like the way we do?

camgunz 3073 days ago

It would literally make the whole of human history a lie, in the same way that Mario never saved the princess and I haven't shot hundreds of ducks.

Semiapies 3074 days ago

So, it's basically working as a goal-oriented fuzzer.

hatsunearu 3074 days ago

Fuzzers are like bug/anomaly/new state finding-oriented reinforcement learning programs, so yeah, in a way :P

tabtab 3073 days ago

So it can become a dirty cheat just like a human. AI is getting more "natural" after all.