by rayuela 3245 days ago
Have you run anything on multiple GPUs or scaled to multiple nodes? My biggest hesitation about using PyTorch is what appears to be its limited support for distributed compute. Being able to easily scale a dynamic graph to an arbitrarily large size across a cluster would make PyTorch an easy sell for me.
3 comments

No, I haven't done any multi-GPU or multi-node work with PyTorch... at least not yet. So far, I've used PyTorch only for quick-turnaround tinkering and experimentation, and for building prototypes, typically with small datasets or smaller subsets of larger datasets.

For real-world workloads, I, along with my work colleagues, currently use TensorFlow, which has good performance, large community infrastructure, and fantastic tooling around it. If an idea shows promise in PyTorch, our next step is usually to implement it in TensorFlow with more data. But we do a lot of experimental tinkering in TensorFlow too. It depends on the learning task at hand.

Note that this version of PyTorch is the first one to support distributed workloads such as multi-node training.

Well, it was literally added in this version: http://pytorch.org/docs/0.2.0/distributed.html

Does it meet your needs?

Ooooh this is exciting. Thanks for pointing that out!
Yes, I use multiple GPUs every day. It works fine under the old version; I have no experience with the new version yet.