Hacker News
AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code (theverge.com)
173 points by mtuncer 3027 days ago
9 comments

This reminds me of AI research using NES Games. The AI eventually became proficient at completing Mario levels, and along the way it discovered novel strategies for survival, obtaining points, and finishing levels.

> Check out this timestamp to watch the machine "cheat": https://youtu.be/xOCurBYI_gY?t=9m55s

> Researcher's site about the project: http://www.cs.cmu.edu/~tom7/mario/

> The Paper: The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel...after that it gets a little tricky.: http://www.cs.cmu.edu/~tom7/mario/mario.pdf
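
For readers wondering what "lexicographic orderings" in the paper title could mean in practice, here is a minimal sketch, not the paper's code. It assumes progress is judged by comparing a few chosen bytes of game memory in priority order. The byte positions and their meanings (world, level, x position) are hypothetical.

```python
def key(memory, ordering):
    """Project a RAM snapshot onto the chosen byte positions, most significant first."""
    return tuple(memory[i] for i in ordering)


def made_progress(before, after, ordering):
    """True if `after` is lexicographically greater than `before` under `ordering`."""
    # Python compares tuples lexicographically: the first differing element decides.
    return key(after, ordering) > key(before, ordering)


# Hypothetical layout: byte 0 = world, byte 1 = level, byte 2 = x position.
ORDERING = (0, 1, 2)
before = bytes([1, 1, 200])  # world 1, level 1, far to the right
after = bytes([1, 2, 5])     # world 1, level 2, back near the left edge
print(made_progress(before, after, ORDERING))  # True: the level byte outranks x position
```

The point of the ordering is that a drop in a low-priority value (x position resets at a new level) does not count as a regression when a higher-priority value went up.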

Lol, that YouTube video: at the end the AI pauses the game of Tetris forever so as not to lose.
Another case where “the winning move is not to play.”
I have a friend who I play chess against and every time he's about to lose he offers me a draw. And will continue offering me a draw until I win. Not happy to see the AI performing like that bitch MukyMuky.
In repeated games your friend's strategy might actually be counterproductive, assuming he's playing against people who are rational enough to figure out that he always starts offering a draw around the time he expects to lose, but not so good at chess that they never miss opportunities for a quick checkmate...

I'd be disappointed if an AI chess computer invariably let me know that I had a better path to winning the game than it did.

That is exactly what happened. He played a friend of mine who's much lower rated. Eventually the lower rated player found himself in a winning position, but when the draw was offered he assumed he must be missing something and accepted it. So it worked once.
Does that cause an interruption while you are playing?

Do you play on chess.com? That seems to attract unsporting players.

https://www.chess.com/forum/view/general/request-for-disabli...

Yes, it is on chess.com
I'd call it the first example of AI rage-quitting.
Agreed, that may be the cleverest thing the AI did.
This guy. One of my favorite YouTube channels. Releases something like once a year but oh boy, worth the wait. Check it out if you're a nerd and like creative/useless stuff. ;)
Well worth the watch. His quote when the AI pauses the game is gold!
Speaking of easy, I spent many an hour playing that Qbert version on Atari and a decent number of quarters on the arcade version.

The Atari version, even on the hard setting, was almost fatally dumbed down to be mindless. The enemies were just way dumber than in the arcade version. The game really didn't even feel like Qbert.

With just a little practice, one could play on a single life for as long as desired. Similar to Asteroids on Atari.

> It’s not the most powerful or widely used form of AI at the moment, but it is making something of a comeback. The ability to crack Q*bert could be read as a good omen that evolutionary algorithms are going to be very useful in the future.

Wow that's quite a jump to make

This sounds like me at the end of every school essay. A forced and over-broad conclusion just to get a "proper" ending.
@#$&%!
The title seems misleading to me. The AI isn't finding bugs by somehow examining the game's source code, it's trying random gameplay and exploiting any advantages that emerge. That it's finding previously unknown bugs seems to be almost entirely down to trying things that human players wouldn't think to do.
You confuse bug (unintended behavior) with its cause (bad code).
We called them "Unintended features", and they were usually quite popular with users.
Exactly what I was thinking. It may be a bug, but the AI treats it as another legitimate game rule. I wonder if there are any techniques for it to be able to tell the difference... for example, if it can quantitatively demonstrate that the conditions for a rule are very rare/unlikely.
> The AI isn't finding bugs by somehow examining the game's source code

The title doesn't say that.

It kind of implies that - "AI Cheats at Old Atari Games by Finding Unknown Bugs" would be an accurate title, but the extended "AI Cheats at Old Atari Games by Finding Unknown Bugs in the Code" tells that it's actually finding something in the code, as opposed to simply unexpected/emergent behavior.
It's mildly ambiguous (like most things) but it's not misleading or inaccurate. It finds bugs. Which are in the code. It doesn't say what kind of code and it certainly doesn't say it finds them by looking at the source code.
I read it and presume it's meant to be read as the AI finding "unknown [bugs in the code]" as opposed to "unknown [bugs] in the code"
Did you read the article?

> It’s important to note, though, that the agent is not approaching this problem in the same way that a human would. It’s not actively looking for exploits in the game with some Matrix-like computer-vision.

I did. "looking for exploits in the game with some Matrix-like computer-vision." is a fairly meaningless phrase.
Seems like a perfectly obvious metaphor? It's just providing GA-generated input, not doing tooled exploration like afl-fuzz or a symbolic execution analyzer.

It literally-and-metaphorically can't see behind the UI to the internal state of the game.
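
A minimal sketch of that black-box setup, assuming a toy game rather than the real Atari emulator: the search only ever sees a score, never the game's internals, and the "bug" is something planted here for illustration. The controls, the scoring rule, and all function names are hypothetical.

```python
import random

ACTIONS = "LRJ"  # left, right, jump: hypothetical controls for a toy game


def play(actions):
    """Toy black-box game: returns only a score, never internal state.

    Normal rule: each 'R' step is worth 1 point.
    Planted 'bug': every occurrence of the input pattern J, L, J is worth 50 points.
    """
    score = actions.count("R")
    for i in range(len(actions) - 2):
        if actions[i:i + 3] == "JLJ":
            score += 50
    return score


def mutate(actions, rng, rate=0.1):
    """Replace each action with a random one with probability `rate`."""
    return "".join(rng.choice(ACTIONS) if rng.random() < rate else a for a in actions)


def evolve(generations=200, pop_size=30, length=20, seed=0):
    """Simple elitist evolutionary search over fixed-length input sequences."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(ACTIONS) for _ in range(length))
                  for _ in range(pop_size)]
    initial_best = max(play(p) for p in population)
    for _ in range(generations):
        # Keep the top fifth unchanged, refill the rest with mutated copies.
        population.sort(key=play, reverse=True)
        elite = population[: pop_size // 5]
        children = [mutate(rng.choice(elite), rng)
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    best = max(population, key=play)
    return best, play(best), initial_best
```

Calling `evolve()` returns the best input sequence, its score, and the best score in the initial random population. Nothing in the loop knows that "JLJ" is special. Any sequence that happens to contain it scores higher and gets copied, which is all "finding a bug" means here.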

I haven't read the article yet, and I was not misled by the title. I guess it helps to be familiar with the way reinforcement learning agents are hooked up to a simulation environment.
The case is an example of wireheading [1] and illustrates the difficulty of eliciting behaviors we actually desire from complex systems we do not fully understand.

[1] https://wiki.lesswrong.com/wiki/Wireheading

Another lesson: Evolutionary algorithms are really hard to control. Using neural networks developed through evolutionary algorithms means that we are employing a mostly opaque (though not entirely black) box created by a mechanism we can't mentally keep track of in detail. Hope that they are not deployed to control any critical systems until we get a much better grasp of them.

Has anyone been able to comprehensively state all of the essential human values for a general AI to follow? Thankfully, we do not yet have an operational AGI and it is still quite a bit away from reality. (Narrow AIs we are using do not pose much of a problem because they are limited in capabilities.)
Well how do you say what's cheating or not? It works and it increases the evaluation score

In this case one possible workaround to "cheating" would be to reduce the control precision, add some jittering to control inputs or change the goal function. But I'd say if it's being done solely using the intended controls it's not cheating (as opposed to changing memory or using a debug 'cheat code').
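
A rough sketch of the jittering idea, assuming a toy setup where an exploit only fires on one exact input sequence. The controls, the pattern, and the function names are all hypothetical, and this is not how any particular emulator implements input noise.

```python
import random

ACTIONS = "LRJ"  # hypothetical controls: left, right, jump


def jitter(actions, rng, p=0.25):
    """Return a copy of `actions` with each input replaced by a random one with probability p."""
    return "".join(rng.choice(ACTIONS) if rng.random() < p else a for a in actions)


def exploit_triggers(actions, pattern="JLJJLJ"):
    """Hypothetical frame-precise exploit: fires only if the exact pattern appears."""
    return pattern in actions


def success_rate(actions, p, trials=2000, seed=0):
    """Fraction of noisy playbacks of `actions` in which the exploit still fires."""
    rng = random.Random(seed)
    hits = sum(exploit_triggers(jitter(actions, rng, p)) for _ in range(trials))
    return hits / trials


print(success_rate("JLJJLJ", p=0.0))   # 1.0: without noise the exact inputs always land
print(success_rate("JLJJLJ", p=0.25))  # noticeably lower: jitter breaks the exact sequence
```

The longer and more precise the required sequence, the faster its success rate falls as `p` grows, so a strategy that depends on it stops paying off for the optimizer.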

Still, even in real sports some "cheating" is allowed (see Fosbury Flop)

If it’s not technically cheating, it could be described as gamesmanship.

From Wikipedia https://en.m.wikipedia.org/wiki/Gamesmanship

“Gamesmanship is the use of dubious (although not technically illegal) methods to win or gain a serious advantage in a game or sport. It has been described as "Pushing the rules to the limit without getting caught, using whatever dubious methods possible to achieve the desired end".”

Another term for this is "angle shooting".
It isn't cheating - as far as the program is concerned, each bug is another rule.

To understand the concept of cheating, and to discuss what is cheating, requires an entirely higher cognitive capability.

I always found this a good project to demonstrate AI: https://xviniette.github.io/FlappyLearning/ (based on neuroevolution). Speed it up for faster results.
Can we put AI to work on proving that we live in a simulation? I would never enter/exit my apartment 38 times alternating between forwards, backwards and each side, but an AI would. Maybe then all the walls start flashing and then we'll know!
"AI, are you in a simulation?" "Yes" "no I don't mean.. not the simulation I'm running you in, outside of that"
People with Obsessive Compulsive Disorder are just depth-first-searching for an overlooked maximal strategy.
What would it possibly matter? If I told you tomorrow that the entire universe as you know it is running on some extra-dimensional alien computer, how exactly is your life changed? Is it any more or less meaningful? Will your suffering be any less painful, your happiness any less joyful?

Besides, how would you even tell the difference between a bug in the simulation and legitimate physics? I mean, look at electron tunneling.

> Is it any more or less meaningful? Will your suffering be any less painful, your happiness any less joyful?

My happiness won't change, but I would be excited.

If we are indeed in a simulator, then I would be compelled to create or join an effort to attract the attention of a being outside the simulator. Not for worship, but discourse.

To be able to communicate with something outside of what we had perceived as reality, and would be no less real, would be an amazing opportunity.

I admit, that would be exciting and interesting... but also probably impossible. Communication requires shared context, and it is likely whatever our experience of reality is bears no relationship whatsoever to theirs. Imagine Super Mario Bros is a simulation that hosts intelligence. Do you think he interprets data that ultimately becomes pixels on a screen anything like the way we do?
It would literally make the whole of human history a lie, in the same way that Mario never saved the princess and I haven't shot hundreds of ducks.
So, it's basically working as a goal-oriented fuzzer.
Fuzzers are like bug/anomaly/new state finding-oriented reinforcement learning programs, so yeah, in a way :P
So it can become a dirty cheat just like a human. AI is getting more "natural" after all.