by limau 3885 days ago
Given its parallelism model and abstractions, which support model, data, or hybrid partitioning and synchronous, asynchronous, or hybrid training, it should be easy to extend to a GPU cluster. However, training is needed only periodically, so if it can be done just as efficiently on existing clusters, why not?