The use of embeddings in OpenAI Five (neuro.cs.ut.ee)
62 points by caliber 2841 days ago
4 comments

Some of the comments in the thread are the usual HN contrarianism - "oh, this isn't impressive because X, Y, Z".

This is not the case. What OpenAI have accomplished is extraordinary, ~~especially the part where there is no explicit communication between the bots~~ (not true, see below). They managed to coordinate simply by looking at what the others were doing.

Did they make elementary errors? Yes they did. Did they fail to grasp the strategy of avoiding fights that the humans used? Yes they did. The fact is that even a rank amateur like me (90th percentile when I was playing) wouldn't have made those mistakes. But I still strongly believe that what has been achieved so far is nothing short of extraordinary. Judging by their rate of progress[1] I would not be surprised to see them come back in a few months with these problems figured out, capable of beating the best teams.

[1] - https://twitter.com/OpenAI/status/1037765547427954688

> They managed to coordinate simply by looking at what the others were doing.

This bit seems incorrect: https://medium.com/@stelmaszczykadam/do-openai-five-dota-2-b....

In Dota you have perfect information of the state of all allied units, so I think it's debatable whether sharing input (observation) data between the bots really counts as "communication".

Though that same fact also means communication shouldn't really be necessary; the bots are all exact copies of each other and share a copy of the game state, so they should all have similar ideas of what actions are optimal at any given point in the game.

Going to echo what I said last time before everyone falls for the propaganda. The bots did not play at a top tier level, the bots didn't even play at a level that could be considered median level. They did not discover new mechanics. They couldn't even understand basic mechanics. Their coordination was no more sophisticated than "wander around together and press all the buttons on the first enemy".

They were showing some signs of competitiveness because of their consistently better-than-human mechanical skills. You can go a long way in Dota 2 off solid mechanical skills.

> They couldn't even understand basic mechanics.

Can you expand on this?

Off the top of my head:

- using skills against targets where the skill would have no effect

- using skills on nothing with all plausible targets not remotely nearby

- using items to turn invisible and then immediately taking an action to remove invisibility

- everything regarding vision/detection: placing vision where vision already exists via structures, placing multiple sources of vision on top of each other, buying detection items when none of the opponents could turn invisible, placing detection in areas where detection already exists

- using an item that's canceled by close proximity to enemies while in obvious close proximity to the enemy

- waiting for a neutral respawn when it's impossible for it to respawn

- stacking effects that don't stack

Then there's obvious poor utilization of skills, such as:

- using a skill that does more damage for every point of missing health on targets that have no missing health

- using ults with long cooldowns on neutrals when that hero's normal skill does more damage with much much lower cooldown

- using skills that multiply their effectiveness when there are secondary targets, when there's obviously no secondary target near the primary target

- casting damage amplification on a target (target takes more damage) and then not dealing damage

You're missing the forest for the trees. Yes, it is true that the bots were suboptimal and made elementary errors, notably around vision. However, overall they did really well. The mere fact that they could coordinate with each other without explicit communication was very impressive.

On each of the poor utilization of skills, I could counter by saying there were other objectives in play. For example, you didn't mention that the "skill that does more damage for every point of missing health on targets" also happens to freeze the target for 1.5 seconds. Perhaps the goal of the bot was just to hold the enemy in place.

Honestly it's pretty easy to sit around saying "yeah nothing impressive here". But I've been playing this game for a decade and fact is, if I had seen OpenAI 5 a year ago I simply would not have believed it was possible for bots to play this well.

> On each of the poor utilization of skills, I could counter by saying there were other objectives in play. For example, you didn't mention that the "skill that does more damage for every point of missing health on targets" also happens to freeze the target for 1.5 seconds. Perhaps the goal of the bot was just to hold the enemy in place.

Using scythe to hold a Bkb'd target in place is a classic use of an immunity piercing stun acknowledging you'll get zero damage. This wasn't the case in the game I watched them play, it was just "a stun". Which is fine, honestly. But they prioritised the stun over the damage and the respawn time increase (which is actually just as important as the damage). I'm not sure I believe they understood all of the aspects of the spell based on how they used it.

The reality remains that the bots are just playing micro level incredible dota, and macro level mediocre dota. There are endless examples. Giving aegis to supports, never stacking creeps for carries, ganking low importance heroes, using DP ult to farm jungle.

They're still better than me and probably anyone I'll ever play with. vOv

Sure, it stuns, but it was often stacked with other stuns. I don't remember it ever being used on a low health target. So I'm drawing the conclusion that the bot only understands that the skill stuns, but doesn't understand the scaling damage.

It's impressive that the AI managed to gain a rudimentary understanding of the game almost completely independently, since DotA2 is a very complicated game. It's just not an interesting problem, because game mechanics can easily be codified. It's like being impressed at self-driving cars because the AI learned that it should stop at a red light; it should just be coded to stop at a red light. Learning to stop at a red light isn't the interesting problem that needs solving for self-driving cars.

It's definitely not the "AI is better than humans" narrative that these articles like to push.

They do in fact broadcast information to teammates [1] in recent versions:

> OpenAI Five sends 512 such values every couple of milliseconds

I think one of the challenges to making the bots play realistic Dota would be to limit this passage of information. I would say only one bot can talk every time period (say 1.5 seconds) and minimum time before new information is broadcast is >0.5 seconds.

[1] https://medium.com/@stelmaszczykadam/do-openai-five-dota-2-b...
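
For what it's worth, the limit proposed above could be sketched roughly like this. This is only one reading of the rule, not anything OpenAI Five actually does: `BroadcastGate` and `try_broadcast` are made-up names, one bot "holds the floor" for each 1.5-second window, and any two broadcasts must be more than 0.5 seconds apart.

```python
class BroadcastGate:
    """Hypothetical limiter: one speaker per window, minimum gap between messages."""

    def __init__(self, window=1.5, min_gap=0.5):
        self.window = window        # seconds a single bot holds the floor (assumed 1.5)
        self.min_gap = min_gap      # broadcasts must be more than this far apart (assumed 0.5)
        self.speaker = None         # bot currently holding the floor
        self.window_start = None    # time the current window opened
        self.last_sent = None       # time of the most recent accepted broadcast

    def try_broadcast(self, bot_id, now):
        """Return True if bot_id may broadcast at time `now` (seconds)."""
        # Enforce the minimum gap between any two accepted broadcasts.
        if self.last_sent is not None and now - self.last_sent <= self.min_gap:
            return False
        # If no window is active, or the old one expired, this bot takes the floor.
        if self.window_start is None or now - self.window_start >= self.window:
            self.speaker = bot_id
            self.window_start = now
        elif bot_id != self.speaker:
            # Another bot holds the floor for the current window.
            return False
        self.last_sent = now
        return True


gate = BroadcastGate()
for bot, t in [(0, 0.0), (1, 0.2), (1, 0.8), (0, 0.8), (1, 1.6)]:
    print(bot, t, gate.try_broadcast(bot, t))
```

Whether the window should belong to one bot or be first-come-first-served is a design choice; the sketch picks first-come-first-served per window.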

I think that smart teams are going to extract what they can out of these bots. Even top players had difficulties dealing with the inhumanness of the bots - people are generally trained to compete against something that acts to some degree on instinct, not probability.

The bots' lack of fear in, e.g., diving a tower before 5 minutes definitely paid off to a certain degree.

I also personally feel that the caster stack of players intentionally lost their matches to hype OpenAI. The limitations of the bots were very apparent, and the pro players at TI easily outsmarted the bots by either:

- Split pushing with 1 hero and fighting with 4.

- Baiting a lower priority hero (e.g. Lion). The AI would very often vastly overcommit for kills in the lategame.

Not even lategame, they overcommitted at all points in the game. Those tower dives for trades early game rarely led to an objective or advantage. They were able to trade early because superior mechanical skill matters more early game.

By the time midgame rolled around, it was pretty clear how naive their strategy was. It has an element of surprise to it since it's not a very human strategy, but just because it's not human doesn't make it remotely good.

It's like watching a car drive on a sidewalk in reverse uphill and honking to avoid pedestrians. It's very impressive that the car figured out that driving on sidewalks reduces collisions with other vehicles, and honking reduces the chance of hitting pedestrians, and it's doing that all while driving in reverse which is very hard for a human to do. But no one in their right mind would call that good driving.

That's a great example. It would be great if you could write a blog post on OpenAI Five. There's a LOT of misinformation on this, and it could use a treatment like this: https://www.alexirpan.com/2018/02/14/rl-hard.html

> The bots' lack of fear in, e.g., diving a tower before 5 minutes definitely paid off to a certain degree.

It paid off because the bots were familiar with the 5-courier format, and knew that they could ferry in consumables to recover quickly. This is not normal in DotA 2; the game was balanced around consumables being limited in the early game, and the human players OpenAI faced were not familiar with this new, unbalanced state of affairs.

Lots of OpenAI Five bashing going on in this thread. I propose a gaming-bot-realism Turing test: when a group of rank 100 or better players cannot discriminate between human teams and the bot team by watching the game, only then are the bots playing "real Dota 2".

Great test. The bots won't be able to pass. Both fast reaction times and lack of cohesive strategy will give them away in the first 5 minutes.

> I think it is amazing that one relatively simple mathematical construct can produce such a complicated behavior.

No complicated behavior was produced.

> Or, I don’t know, maybe it says something about the complexity of Dota 2 game?

Dota 2 game was not played. A tiny subset of the game was attempted. Since people seem to be just buying whatever OpenAI propaganda sells them, let me be specific: only 18 heroes are in the game. The combinatorics explode when you go from 18 to 110. Go on a 5x5 board is a joke compared to 19x19 Go. This is not hard to understand.
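
To put a rough number on the 18 vs. 110 point, here is a back-of-the-envelope sketch. It only counts hero lineups (two disjoint teams of five, sides distinguished) and ignores items, positions, and everything else; `lineup_pairs` is a made-up helper, and 110 is used as a round figure for the full pool.

```python
from math import comb


def lineup_pairs(pool_size: int, team_size: int = 5) -> int:
    """Count ordered pairs of disjoint lineups (one per side) drawn from the pool."""
    first_side = comb(pool_size, team_size)
    # The second side picks from whatever heroes the first side left over.
    second_side = comb(pool_size - team_size, team_size)
    return first_side * second_side


restricted = lineup_pairs(18)   # the restricted pool discussed above
full = lineup_pairs(110)        # assumed round figure for the full pool

print(f"18-hero pool:  {restricted:,} lineup pairs")
print(f"110-hero pool: {full:,} lineup pairs")
print(f"growth factor: {full / restricted:.2e}")
```

With 18 heroes that is 11,027,016 possible matchups; with 110 it is on the order of 10^16, roughly a billion times as many, before even considering how each matchup actually plays out.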

> Do short term tactics combined with fast reaction time beat long-term strategy?

In a game designed for humans, perhaps yes, because the game wouldn't be tested against extremely fast reaction times. The game was meant to be a strategic game for humans. Just because OpenAI Five appears to play Dota 2 (still doesn't beat any serious players though) doesn't imply anything fundamental about tactics beating strategy.

>> No complicated behavior was produced.

The bots were routinely pulling off coordinated team behavior that players couldn't figure out, but that worked. This has to qualify as complicated.

>> Dota 2 game was not played. A tiny subset of the game was attempted. Since people seem to be just buying whatever OpenAI propaganda sells them, let me be specific: only 18 heroes are in the game. The combinatorics explode when you go from 18 to 110. Go on a 5x5 board is a joke compared to 19x19 Go. This is not hard to understand.

Already mentioned that the hero pool has been opened up. Along with the removal of the other restrictions (invincible courier, items) this is basically pure DotA.

>> Just because OpenAI Five appears to play Dota 2 (still doesn't beat any serious players though) doesn't imply anything fundamental about tactics beating strategy.

The bots beat a team of 5 casters (granted, with some of the older restrictions in place) who are individually in the top 1% of DotA players by MMR.

You're flat out wrong or at the very least inaccurate in all three statements that you made.

> The bots beat a team of 5 casters (granted, with some of the older restrictions in place) who are individually in the top 1% of DotA players by MMR.

And without the 5 couriers the caster team probably would have won.

> The bots were routinely pulling off coordinated team behavior that players couldn't figure out, but that worked. This has to qualify as complicated.

This is just not true. Do you know the game? Are you speaking as a player? Or are you telling us what's written in the OpenAI blog post? There is nothing a player couldn't figure out. The caster team lost because of the broken game and because a few of them were rusty (Merlini hadn't played for months). The bots were garbage at the TI, and got beaten without any problem by the pro teams.

> Already mentioned that the hero pool has been opened up. Along with the removal of the other restrictions (invincible courier, items) this is basically pure DotA.

The hero pool is still 18 heroes. Dota 2 has over 110 heroes. Can you please try to think what makes you say something so wrong with so much confidence?

> The bots beat a team of 5 casters (granted, with some of the older restrictions in place) who are individually in the top 1% of DotA players by MMR.

Casters don't count as serious players. Would you regard any sports commentators as good players? Would you put 5 of them randomly in a team and say that's representative of the best players of that sport?

> This is just not true. Do you know the game? Are you speaking as a player? Or are you telling us what's written in the OpenAI blog post? There is nothing a player couldn't figure out. The caster team lost because of the broken game and because a few of them were rusty (Merlini hadn't played for months). The bots were garbage at the TI, and got beaten without any problem by the pro teams.

A former player. I'm not regurgitating the blog post, I'm regurgitating what the players themselves said. The AI got beaten by pro teams (top <<1%), but the matches were competitive in the early game, and only later did the bots run into trouble. To give a couple of specific examples of novel behavior: the AI figured out a solid deathball strategy and was able to exploit that to beat a lot of teams, it liberally used fortify to protect creeps and sustain pushes, and it was way more aggressive in rotating its supports to critical lanes in the early game. Now, no single one of those things is entirely novel, but the combination of all of them (especially by a machine that learned it on its own) is what is novel, and what allowed the strategy to be successful.

> The hero pool is still 18 heroes. Dota 2 has over 110 heroes. Can you please try to think what makes you say something so wrong with so much confidence?

Admittedly this was a mistake; the language they used in their blog post was "Removed our last major restriction from what most pros consider 'Real Dota' gameplay", which is poorly explained and made me think the hero pool was entirely open.

DotA is basically 2 games, the drafting part and the gameplay part. The bots made huge progress in figuring out the gameplay part, which is super impressive.

>> Casters don't count as serious players.

Maybe we're using different language. The casters are definitely in the top 1% of players, or more, which I consider "serious", but not "the best". But no one was arguing that the bots are "the best", which is self-evident from their loss at the International.

Anyways, this is all beside the point. What OpenAI was able to do was really impressive and is only helping to advance the state of reinforcement learning. You argued that there's nothing impressive about what they've done, but I'd love to see you point me to an example of an ML algorithm that learned to play a team game as complex as DotA at a competent level.

At a competent level? None exist. Being better than random doesn't count as competency. As others have said in this thread, it wouldn't even pass as median performance.

> is only helping to advance the state of reinforcement learning

Zero new algorithms or ideas were introduced by OpenAI Five. We just learned that model-free RL doesn't scale, and we already knew that from Atari and robotics benchmarks.

> This is just not true. Do you know the game? Are you speaking as a player? Or are you telling us what's written in the OpenAI blog post? There is nothing a player couldn't figure out.

A lot has been made of the 1v1 Shadow Fiend OpenAI player having figured out an edge case of Magic Stick recharging. (Casting spells while outside vision doesn't give charges to the enemy hero.) This mechanic was not previously known to the OpenAI team, but was known to professional players -- it's not the epiphany it's made out to be.

It was a subset in early demos, but not any more: https://blog.openai.com/the-international-2018-results/

This is the kind of misinformation OpenAI is so great at. Otherwise smart people like tlb are led to believe that this was the full game even though it's not even close to the full game. Thanks for helping me make my point.

It's an Elon Musk outfit; misinformation is in its DNA.