Hacker News
by Eridrus 3590 days ago
Realistically, most people barely get to multiple GPUs, let alone multiple machines. You're more likely to do hyperparameter tuning across machines before you do distributed training.