by dargscisyhp
1056 days ago
I thought the AlphaZero paper was pretty cool: https://arxiv.org/abs/1712.01815 Not only did we get a whole new type of chess engine, it was also interesting to see how the engine rated different openings at various stages of its training. For instance, the Caro-Kann, which is my weapon of choice, was favored quite heavily for several hours of training and then seemingly rejected near the end (perhaps the engine even refuted it?!).
The super cool thing about MuZero [1] is that it learns a model of the problem's dynamics, i.e. you don't have to give it the rules of the game, which makes the algorithm very general. For example, DeepMind threw MuZero at video compression (rate control in the VP9 encoder) and found that it can reduce compressed video sizes by 6.28% on average at the same quality, which is massive for something like YouTube [2][3].
Curious if anyone else knows of examples of MuZero being deployed outside of toy problems?
[1] https://arxiv.org/pdf/1911.08265.pdf
[2] https://arxiv.org/pdf/2202.06626.pdf
[3] https://www.deepmind.com/blog/muzeros-first-step-from-resear...
(edit s/Google/DeepMind)
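To make the "learns the dynamics" point above concrete, here is a rough toy sketch of the three learned functions described in the MuZero paper (representation, dynamics, prediction) and how an action sequence can be scored entirely inside the learned model, with no call into real game rules. Everything here is an illustrative assumption, not DeepMind's code: the names (ToyModel, rollout_return), the tiny sizes, the 0.99 discount, and the random untrained weights standing in for neural networks. The real algorithm trains these functions end to end and plans with Monte Carlo tree search rather than a fixed action list.

```python
import math
import random

HIDDEN, ACTIONS, OBS = 4, 3, 5  # toy sizes, chosen arbitrarily for illustration


def _matrix(rows, cols, rng):
    """Random weight matrix standing in for a trained network."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def _apply(m, v):
    """One tanh layer: every output lies strictly between -1 and 1."""
    return [math.tanh(sum(w * x for w, x in zip(row, v))) for row in m]


class ToyModel:
    """Untrained stand-ins for the three functions (hypothetical names)."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.h = _matrix(HIDDEN, OBS, rng)                   # representation
        self.g = _matrix(HIDDEN + 1, HIDDEN + ACTIONS, rng)  # dynamics
        self.f = _matrix(ACTIONS + 1, HIDDEN, rng)           # prediction

    def represent(self, obs):
        """Observation -> hidden state."""
        return _apply(self.h, obs)

    def step(self, state, action):
        """(hidden state, action) -> (next hidden state, predicted reward)."""
        one_hot = [1.0 if i == action else 0.0 for i in range(ACTIONS)]
        out = _apply(self.g, state + one_hot)
        return out[:-1], out[-1]

    def predict(self, state):
        """Hidden state -> (policy logits, value estimate)."""
        out = _apply(self.f, state)
        return out[:-1], out[-1]


def rollout_return(model, obs, actions, discount=0.99):
    """Score an action sequence using only the model, never the real rules."""
    state = model.represent(obs)
    total, scale = 0.0, 1.0
    for a in actions:
        state, reward = model.step(state, a)
        total += scale * reward
        scale *= discount
    _, value = model.predict(state)  # bootstrap from the final hidden state
    return total + scale * value
```

Calling rollout_return(ToyModel(), [0.1] * OBS, [0, 1, 2]) returns a single float score for that three-move plan. With untrained weights the number is meaningless; the point is only that nothing in the loop needs a simulator or rulebook.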