by TheGuyWhoCodes 3780 days ago
MLDB looks great, but I was referring to distributed model building that scales horizontally, which SparkML does and TensorFlow says it does. If they can implement distributed gradient boosted trees across nodes, maybe even with GPU support (although I'm not sure it's applicable), that could be huge.
1 comment

Once the open-source version of TensorFlow releases multi-node support, this would be one way to make it work. There are potential gains from using a GPU for random forest (RF) training. As for distributing: in my experience it doesn't make much difference for small models, and for larger models the cost of distributing the dataset outweighs the benefit of having multiple nodes. But an implementation carefully designed for a given node topology could be made to perform better.
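
To put rough numbers on that trade-off, here is a back-of-envelope sketch in Python. It is only a toy cost model, not a benchmark: the function name, the assumption that compute parallelizes perfectly, the assumption that the whole dataset is shipped once over a single link, and every figure in the usage lines are made up for illustration.

```python
def training_times(compute_s, dataset_gb, nodes, bandwidth_gb_per_s):
    """Return (single_node_seconds, distributed_seconds) under a toy model.

    Assumptions (all simplifications, none measured):
      - compute splits perfectly across nodes
      - the full dataset is transferred once at bandwidth_gb_per_s
      - no synchronization or communication cost during training
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth_gb_per_s must be positive")
    if nodes == 1:
        # Nothing to distribute, so there is no transfer cost.
        return compute_s, compute_s
    transfer_s = dataset_gb / bandwidth_gb_per_s
    distributed_s = compute_s / nodes + transfer_s
    return compute_s, distributed_s


# Hypothetical small job: 1 minute of compute, 1 GB of data, 8 nodes.
small_single, small_dist = training_times(60, 1, 8, 0.1)

# Hypothetical large job: 1 hour of compute, 500 GB of data, 8 nodes.
large_single, large_dist = training_times(3600, 500, 8, 0.1)

print(f"small: {small_single:.1f}s single vs {small_dist:.1f}s distributed")
print(f"large: {large_single:.1f}s single vs {large_dist:.1f}s distributed")
```

With these invented numbers the small job saves well under a minute, and the large job gets slower because the transfer term swamps the parallel compute. A topology-aware design would aim to shrink that transfer term, for example by keeping data local to each node.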