by alper111 2852 days ago
If you look at network structure, it acts as one agent, not five. So, free coordination. (See: https://t.co/GPKHPsIu1C)

In my opinion, what I see is a very good player who knows how to chain stuns precisely, without any strategic depth. If you claim to have built an AI system that you ultimately want to evolve into AGI, you would at least expect some sort of strategic decision making at the macro level. Though since it has almost perfect micro, it can easily outplay most teams. So yeah, with that expectation I see this as a joke, too.

P.S. The model is trained with 128k CPU cores and 256 GPUs. It is able to play 180 years' worth of games in a day. Think about it.


It's five independent agents. The article on OpenAI's website and the network structure both say this. I'll zoom in since it's a complicated structure.

It's the first line of the article: https://blog.openai.com/openai-five/

>Our team of five neural networks,

They use a hyperparameter called team spirit to cooperate. I don't think the goal of this is AGI at all, so I don't see why people are making that leap. But sure, for the geniuses of HN this must clearly be trivial.

They're not independent agents. The neural networks have the same input, share weights, and also share some activations. With that much sharing, it's better to think of it as one neural network with output heads for all 5 players. So coordination is free. Actually, "coordination" makes as little sense here as saying that multiple neurons in a neural network are cooperating, or that the two legs of a humanoid are cooperating to walk. Further, there is no game being played between heroes of the same team. They literally have the same objective. The "coordination" buzzword is just another attempt by OpenAI to confuse and mislead readers, and give a false sense of their progress.

They cannot share the same inputs unless the team spirit hyperparameter is exactly 1, which it is not. You are partially correct in that the agents consume the parameters of the four other agents, but these are weighted differently according to the team spirit parameter.

The team spirit hyperparameter is a crutch they've introduced themselves. Ideally it should be one. In Dota there is only one objective for the entire team, and that's to win the game. The fact that they shape rewards is an implementation detail and doesn't change the fact that Dota 2 does not require cooperation, because there's no cooperation game being played. It's a purely zero-sum adversarial game played between two teams.
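
For anyone unsure what the team spirit knob being argued about might look like, here is a minimal sketch. It assumes team spirit linearly blends each hero's own shaped reward with the team average, which is how I read the description in the OpenAI article; the function name `blend_rewards` and the sample numbers are made up for illustration and are not OpenAI's code.

```python
from typing import List


def blend_rewards(individual: List[float], team_spirit: float) -> List[float]:
    """Blend each hero's own reward with the team's mean reward.

    Assumption (not confirmed by the thread): a simple linear interpolation,
    where team_spirit = 0 means purely selfish rewards and
    team_spirit = 1 means every hero receives the team average.
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be in [0, 1]")
    if not individual:
        raise ValueError("need at least one hero reward")

    team_mean = sum(individual) / len(individual)
    return [
        (1.0 - team_spirit) * reward + team_spirit * team_mean
        for reward in individual
    ]


# Hypothetical shaped rewards for five heroes on one timestep:
# only the first hero did something rewarded (say, a last hit).
rewards = [1.0, 0.0, 0.0, 0.0, 0.0]

selfish = blend_rewards(rewards, 0.0)   # each hero keeps its own reward
mixed = blend_rewards(rewards, 0.5)     # halfway between own and team mean
shared = blend_rewards(rewards, 1.0)    # every hero gets the team mean
```

Under this assumption the blend only redistributes the shaped reward among the five heroes (the team total is unchanged), and at team spirit 1 all five get an identical signal, which is the "single objective" case described in the last comment.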