|
|
|
|
|
by gcr
3799 days ago
|
|
Deep learning workloads are typically compute-intensive, but they also tend to be extremely I/O intensive, and convergence may depend on a synchronous step where all the nodes must finish making their contribution to the model before any of them can continue. (This may not be quite true though -- see Google's DistBelief paper--but most frameworks work this way). Often times, adding more machines to a cluster may make training proportionally slower. |
|