Hacker News new | ask | show | jobs
by mun2mun 5286 days ago
I have a question. I have read somewhere that map-reduce can leverage parallelism. So if I map a function to an array every element in the array is mapped with that function so that they can be executed parallely because they have no dependency on each other. But how do reduce leverage parallelism? As far as I understand output of the reduce function is dependent on the previous value.
1 comments

In principle, reductions can often be staged, since there's no ordering requirements. Imagine a tree of reductions. But you are correct, the reduce phase is what will limit parallelism. If you have a cheap map operation, but a really expensive reduction, you may not see much scalability. (Where "scalable" is a way of saying "performance improves as available hardware increases because more parallelism inherent in the application is exploitable.")