Hacker News
by
dave168
3590 days ago
CNTK is great at scaling out beyond a single machine. The paper didn't benchmark that; it only tested single-box performance.
1 comment
Eridrus
3590 days ago
Realistically, most people barely get to multiple GPUs, let alone multiple machines. You're more likely to do hyperparameter tuning across machines before you do distributed training.