|
|
|
|
|
by gdb
2917 days ago
|
|
The best way we know to think of it is in terms of variance of the gradient. In a hard environment, your gradients will be very noisy — but effectively no more than linear in the duration you are optimizing over, provided that you have a reasonable solution for exploration. As you scale your batch size, you can decrease your variance linearly. So you can use good ol' gradient descent if you can scale up linearly in the hardness of the problem. This is a handwavy argument admittedly, but seems to match what we are seeing in practice. Simulators are nice because it is possible to take lots of samples from them — but there's a limit to how many samples can be taken from the real world. In order to decrease the number of samples needed from the environment, we expect that ideas related to model-based RL — where you spend a huge number of neural network flops to learn a model of the environment — will be the way to go. As a community, we are just starting to get fast enough computers to test out ideas there. |
|
They also understand all of the nuances, similar to HN. Last year when you guys beat Arteezy, everyone grokked that 5v5 was a completely different and immensely difficult problem in comparison. There's a lot of talent floating around /r/dota2, amidst all the memes and silliness. And for whatever reason, the community loves programming stories, so people really listen and pay attention.
https://imgur.com/Lh29WuC
So yeah, we're all rooting for you. Regardless of how it turns out this year, it's one of the coolest things to happen to the dota 2 scene period! Many of us grew up with the game, so it's wild to see our little mod suddenly be a decisive factor in the battle for worldwide AI dominance.
Also 1v1 me scrub