by agibsonccc
2378 days ago
Your concerns are right on point. I agree that Spark is a great SQL/ETL tool. My thinking was on the "math execution" part; Ray is able to do a bit more there. I do feel like there is a bit of hype riding going on here as well. One interesting thing that could happen is that the hardware gets better and these distributed schedulers can't keep up with all the different options on the market. There is also the tension between hardware vendors, who want to give away software that only runs on their chips, and software makers, who want things to run on every chip. It seems like there will be a lot of competition among the various infra players in the next few years now that NVIDIA is starting to have real competition (even if it's not big yet).
Ray shows expertise in multi-machine execution that's lacking in stuff like JAX, TensorFlow, and PyTorch. Horovod nailed down a lot of the performance issues for SGD in particular, but is missing the sort of rapid deployment / distribution stuff in Ray. If only they could all work together ...