by lmm
4409 days ago
Spark's abstractions are indeed really nice; a Spark job is much more readable than the same thing expressed in raw MapReduce, as that post acknowledges at the end. I can't really comment on or rebut "my code runs slow and I don't know why", except to say that Spark performance has been great when I've used it. But yeah, if the abstraction does fail (and again, all I can say is that it hasn't for me), then I can imagine it's not much fun to debug performance, and there's no distributed profiler (though I think you'd be in much the same boat with vanilla Hadoop).

Can you say more about your use case? What sort of data did you start with? What did you do with it? How large was the cluster you were running on?