HN Mirror



	by lmm 4409 days ago
	Spark's abstractions are indeed really nice; a Spark job is much more readable than the same thing expressed in raw MapReduce, as that post acknowledges at the end. I can't really comment on or rebut "my code runs slow and I don't know why", except to say that Spark performance has been great when I've used it. But yeah, if the abstraction does fail (and again, all I can say is that it hasn't for me), I can imagine it's not much fun to debug performance, and there's no distributed profiler (though I think you'd be in much the same boat with vanilla Hadoop).

1 comment

> that Spark performance has been great when I've used it.

Can you say more about your use case? What sort of data did you start with? What did you do with it? How large was the cluster you were running on?

Not sure how much I should say. Advertising analytics. Fairly small cluster (<100). More for ad-hoc theory testing than anything regular.