by clickwiseorange 625 days ago
Good question. It's not just ibm14: everything people outside Google have tried (NVDLA, BlackParrot, etc.) shows that RL is much worse than prior methods. There is a strong possibility that Google pre-trained RL on certain TPU designs, then tested on those same designs, and submitted the results to Nature.