by torginus
71 days ago
My two cents is that LLMs are way stronger in areas where the reward function is well defined, such as exploitation: you break the security, you succeed. It's much harder to establish what counts as a usable, well-architected, novel piece of software, so progress in that area isn't nearly as fast. With exploits, you can just gradient-descent your way to world domination, provided you have enough GPUs.