| So the cool thing here is that you can use Spark and TensorFlow to find the best model, like Microsoft Research did with ResNets. http://www.wired.com/2016/01/microsoft-neural-net-shows-deep... They're showing you how to train different architectures simultaneously and then compare their results to select the best one. That's great as far as it goes. The drawback is that with this scheme you can't actually train a given network any faster, which is what you want from Spark. What is the role of a distributed runtime in training artificial neural networks? It's simple: NNs are computationally intensive, so you want to share the work across many machines. Spark can help you orchestrate that through data parallelism, parameter averaging and iterative reduce, which is what we do with Deeplearning4j. http://deeplearning4j.org/spark
https://github.com/deeplearning4j/dl4j-spark-cdh5-examples Data parallelism is an approach Google uses to train neural networks on large amounts of data quickly. The idea is that you shard your data across many identical copies of the model, have each copy train on a separate machine, and then average their parameters. That works, it's fast, and it's how Spark can help you do deep learning better. |
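
To make the shard, train, average loop concrete, here's a minimal sketch in plain Python. It is not Deeplearning4j's or Spark's API: the names `sgd_epoch`, `average` and `train` are made up for illustration, the "model" is just a two-parameter linear fit, and the workers are simulated one after another in a single process. On a real cluster each shard would live on its own machine and the averaging would be the reduce step.

```python
import random

def sgd_epoch(weights, shard, lr):
    """One pass of plain SGD for the linear model y = w0 + w1 * x over one shard."""
    w0, w1 = weights
    for x, y in shard:
        err = (w0 + w1 * x) - y   # prediction error on this example
        w0 -= lr * err            # gradient step for the bias
        w1 -= lr * err * x        # gradient step for the slope
    return (w0, w1)

def average(params):
    """Element-wise mean of the parameter tuples returned by the workers."""
    n = len(params)
    return tuple(sum(p[i] for p in params) / n for i in range(len(params[0])))

def train(data, num_workers=4, rounds=50, lr=0.05):
    # Shard the data: each (simulated) worker gets its own slice.
    shards = [data[i::num_workers] for i in range(num_workers)]
    weights = (0.0, 0.0)
    for _ in range(rounds):
        # "Map": every worker starts from the same weights and trains on its shard.
        local = [sgd_epoch(weights, shard, lr) for shard in shards]
        # "Reduce": average the workers' parameters and use them for the next round.
        weights = average(local)
    return weights

# Toy data (assumption for the demo): noise-free points on y = 2 + 3x.
rng = random.Random(0)
data = [(x, 2.0 + 3.0 * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(400))]
w0, w1 = train(data)
print(w0, w1)
```

On this toy data the averaged weights should land close to the true (2, 3). The number of workers, how often you average, and the learning rate are the knobs you'd tune in a real setup; the values here are arbitrary.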