by Der_Einzige 1069 days ago
Yes, and the only reason it doesn't work is that no one has written truly fast GPU implementations of them. Don't let anyone here tell you otherwise: even small-scale, crappy versions (like what I could code in numpy) can solve reinforcement learning problems rather quickly. Naysayers might tell you that it doesn't work, but they are wrong. Global optimization is strictly superior to local optimization in general, and we in the AI field are stuck deep in a local minimum right now.

Here's me implementing an algorithm from 2009 on a single CPU core and getting pretty excellent results on RLHF benchmarks: https://github.com/Hellisotherpeople/Python-Cooperative-Syna...

1 comment

The whole argument is that at large scales, with billions of params, it doesn’t matter, specifically because of those billions, so giving a toy example seems to miss the point.