by mhamilton723
2668 days ago
Hey, thanks for your interesting point of view on Spark. I do a lot of my small-data prototyping in Spark in single-machine mode, and it has worked for me thus far :). On the likelihood functions comment, I totally agree. It is easier to build custom likelihood models in autograd libraries, which is why we created CNTK on Spark and Databricks created TensorFlow on Spark. These give you the flexibility of modern deep learning stacks with the elasticity of Spark. In the end, Spark is a single tool in a collection of tools and might not be right for your project, but it's been good for a lot of our work here at MSFT :)!
Here again, if you tie machine learning to a big system like Spark, which is typically a huge IT cost in a lot of companies, and commit to an underlying data model suitable for Spark, you end up orienting everything around Spark (someone with the scale of Microsoft might not suffer this problem like everybody else). Altogether, that usually makes Spark such a limiting choice that it is totally impractical to standardize on it and give up all the other types of models, data storage, and cluster computing techniques you might need on a project-by-project basis.