|
|
|
|
|
by quantumspandex
497 days ago
|
|
AlphaGo seems more like an automated process to me because you can start from nothing except the algorithm and the rules. Since a Go game only has 2 outcomes most of the time, and the model can play with itself, it is guaranteed to learn something during self-play. In the LLM case you have to have an already capable model to do RL. Also I feel like the problem selection part is important to make sure it's not too hard. So there's still much labor involved. |
|