by atombender
884 days ago
That's my understanding. MR is very simplistic, and many problems are awkward or impossible to express in it, whereas dataflow processors like Spark and Apache Beam support building complex DAGs from a rich set of operators for grouping, windowing, joining, etc. that you just don't have in MR. You can do MR within a DAG, so you could say that dataflows are a generalization or superset of the MR model.
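
To make the "MR within a DAG" point concrete, here is a rough sketch in plain Python. It is not Spark or Beam code; `flat_map`, `group_by_key`, and `map_reduce_pipeline` are hypothetical names made up for illustration. The idea is that an MR job is just one fixed three-node chain of dataflow operators:

```python
from collections import defaultdict


def flat_map(fn, items):
    """Dataflow operator: apply fn to each item and flatten the results."""
    for item in items:
        yield from fn(item)


def group_by_key(pairs):
    """Dataflow operator: collect (key, value) pairs into (key, [values])."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups.items()


def map_reduce_pipeline(mapper, reducer, items):
    """MapReduce expressed as the fixed chain flat_map -> group_by_key -> flat_map."""
    mapped = flat_map(mapper, items)
    grouped = group_by_key(mapped)
    return flat_map(lambda kv: reducer(kv[0], kv[1]), grouped)


def word_count(lines):
    """Classic word count, run through the three-operator chain above."""
    return dict(
        map_reduce_pipeline(
            lambda line: ((word, 1) for word in line.split()),
            lambda word, ones: [(word, sum(ones))],
            lines,
        )
    )
```

A real dataflow engine would let you wire these operators into arbitrary graphs (several inputs, branches, windows), while MR only ever gives you this one shape per job.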
|
I think it's the opposite of this. MapReduce is a very generic mechanism for splitting computation up so that it can be distributed. It would be possible to build Spark/Beam and all their higher-level DAG components out of MapReduce operations.
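
As a rough illustration of that direction, here is a toy sketch in plain Python, assuming a single-process stand-in for MapReduce. `run_mapreduce` and `join` are hypothetical names, not any real framework's API. It builds a higher-level operator, an inner join, out of a single map/shuffle/reduce pass by tagging each record with the side it came from:

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Toy single-process MapReduce: map, shuffle by key, then reduce."""
    shuffled = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            shuffled[key].append(value)
    output = []
    for key, values in shuffled.items():
        output.extend(reducer(key, values))
    return output


def join(left, right):
    """Inner join of two lists of (key, value) pairs as one MapReduce job."""
    # Tag every record with its source so the reducer can tell the sides apart.
    tagged = [("L", k, v) for k, v in left] + [("R", k, v) for k, v in right]

    def mapper(record):
        side, key, value = record
        yield key, (side, value)

    def reducer(key, tagged_values):
        lefts = [v for side, v in tagged_values if side == "L"]
        rights = [v for side, v in tagged_values if side == "R"]
        # Emit the cross product of matching left and right values.
        for left_value in lefts:
            for right_value in rights:
                yield key, (left_value, right_value)

    return run_mapreduce(tagged, mapper, reducer)
```

Chaining several such jobs, each feeding the next, gives you a DAG built purely from MapReduce steps. Whether that is pleasant or efficient compared to a native dataflow engine is a separate question.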