Hacker News
by sillysaurusx 2334 days ago
It's not that slow. And you can use many TPUs together to make up the speed difference.

If that were the case, why would anyone buy GPUs? I invite you to retrain a state-of-the-art model of your choice on a CPU and see how far you get.
We fine-tuned GPT-2 1.5B for subreddit simulator using this technique. https://www.reddit.com/r/SubSimulatorGPT2Meta/comments/entfg...