Hacker News
by acetabulum 1262 days ago
If you use Horovod Elastic, I think you can avoid this problem when training across a cluster of Spot instances.
https://horovod.readthedocs.io/en/stable/elastic_include.htm...
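
For anyone wondering why this helps with Spot instances: as I understand it, Horovod Elastic wraps the training function (`hvd.elastic.run`) and keeps a state object that you periodically `commit()`, so that when a worker fails the job rolls back to the last commit and carries on instead of dying. Below is a toy, pure-Python sketch of that commit/restore pattern only. It does not use Horovod, and every name in it (`ToyState`, `WorkerLost`, `run_elastic`, `make_train`) is hypothetical and made up for illustration.

```python
import copy


class WorkerLost(Exception):
    """Hypothetical stand-in for a worker vanishing (e.g. a Spot reclaim)."""


class ToyState:
    """Hypothetical in-memory training state with commit/restore."""

    def __init__(self, **fields):
        self.__dict__.update(fields)
        self._saved = {}
        self.commit()  # the initial values are the first checkpoint

    def _fields(self):
        # Everything except the saved snapshot itself.
        return {k: v for k, v in self.__dict__.items() if k != "_saved"}

    def commit(self):
        # Snapshot the current state so a later failure can roll back to it.
        self._saved = copy.deepcopy(self._fields())

    def restore(self):
        # Discard uncommitted progress and return to the last snapshot.
        self.__dict__.update(copy.deepcopy(self._saved))


def run_elastic(train_fn, state, max_restarts=10):
    """Re-run train_fn after each simulated failure, from the last commit."""
    for _ in range(max_restarts + 1):
        try:
            return train_fn(state)
        except WorkerLost:
            state.restore()
    raise RuntimeError("too many restarts")


def make_train(total_steps, fail_at, commit_every=5):
    """Build a toy training loop that fails once at each step in fail_at."""
    pending = set(fail_at)

    def train(state):
        while state.step < total_steps:
            if state.step in pending:
                pending.discard(state.step)  # each failure happens only once
                raise WorkerLost(f"lost a worker at step {state.step}")
            state.total += state.step  # stand-in for one training step
            state.step += 1
            if state.step % commit_every == 0:
                state.commit()
        return state.total

    return train


state = ToyState(step=0, total=0)
result = run_elastic(make_train(20, fail_at={7, 13}), state)
print(result == sum(range(20)))
```

The two simulated failures only cost the steps since the last commit, and the final result matches an uninterrupted run. The trade-off is the commit interval: committing more often loses less work per interruption but adds overhead to every step.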