Hacker News
by TallGuyShort 4377 days ago
I don't know much about MLlib specifically, but I expected to come here and see more comments about Spark. It handles both batch and stream processing relatively well, which lets you reuse a lot of code between the two pipelines. The primary original motivation seems to have been to "beat the CAP theorem" by combining distributed systems with different characteristics, so using a single system for both pipelines would defeat that point. Like the author, though, I don't think "beating the CAP theorem" this way is going to produce results that warrant the work.