by IvanVergiliev
3821 days ago
Reduce can perform reductions locally on each machine before shuffling the data. This cuts both memory use and network overhead.
If you need all the elements for a given key, e.g. to display them to a user or save them to a DB, then groupBy is probably the right choice. If you're going to perform some form of reduce right after it, though, groupBy is likely sub-optimal, because every element gets shuffled before anything is combined.
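
To make the difference concrete, here is a rough sketch in plain Python. It is a toy model, not any framework's actual API: the "partitions" are just lists standing in for machines, the function names are made up for illustration, and "shuffled" simply counts how many records would have to cross the network. It also assumes the reduce function is associative and commutative, since that is what makes combining locally safe.

```python
from collections import defaultdict
from functools import reduce
from operator import add


def group_then_reduce(partitions, fn):
    """Hypothetical groupBy-then-reduce: every record is shuffled first."""
    shuffled = [record for part in partitions for record in part]
    groups = defaultdict(list)
    for key, value in shuffled:
        groups[key].append(value)  # all values for a key are held in memory
    result = {key: reduce(fn, values) for key, values in groups.items()}
    return result, len(shuffled)


def reduce_with_local_combine(partitions, fn):
    """Hypothetical reduce: combine per partition, then shuffle the partials."""
    shuffled = []
    for part in partitions:
        local = {}
        for key, value in part:
            local[key] = fn(local[key], value) if key in local else value
        shuffled.extend(local.items())  # at most one record per key per partition
    result = {}
    for key, value in shuffled:
        result[key] = fn(result[key], value) if key in result else value
    return result, len(shuffled)


# Two made-up partitions of (key, value) pairs.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 4), ("b", 5), ("b", 6)],
]

grouped_result, grouped_shuffled = group_then_reduce(partitions, add)
combined_result, combined_shuffled = reduce_with_local_combine(partitions, add)

print(grouped_result, grouped_shuffled)
print(combined_result, combined_shuffled)
```

Both versions end up with the same per-key totals, but the second one shuffles at most one record per key per partition instead of one per element, and never has to hold a whole group in memory.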