by 7777777phil
56 days ago
AlphaZero worked because chess and Go have terminal rewards and positions you can prove are right or wrong. General intelligence has neither, and the leap from self-play in a well-defined game to self-play in arbitrary environments is the hard part Silver isn't really demoing. Sara Hooker's stuff on scaling laws lines up here (1)

(1) https://philippdubach.com/posts/the-most-expensive-assumptio...