by mystique
4100 days ago
MapReduce and RDBMS are apples and oranges - both are good at what they do and are effective within their own use cases. One lets you handle any type of data and manage it however you like; the other lets you understand your data if you can live within a defined structure. It is silly to suggest using MapReduce to power a dashboard with sub-second response times. In the same way, it is silly to suggest using MPP or RDBMS-like techniques for processing highly unstructured or even semi-structured content. Apache Spark is getting close to being able to do both, but as a developer building a data stack, I still would not scan terabytes of data every single time if 80% of questions can be answered by looking at the data once and saving summarized results in a relational format. I thought Hadoop vs. RDBMS was a fight settled maybe 4-5 years ago! Amusing to see it being raised now.
|
On the other hand, we both build and use summary tables with Spark (in a relational format to boot, using Spark SQL).
I think you would benefit from re-evaluating the assumptions you made 4-5 years ago.
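
For readers unfamiliar with the summary-table pattern both comments refer to, here is a minimal sketch. It is an illustration under stated assumptions, not either commenter's actual setup: a plain Python loop stands in for the batch job (MapReduce or Spark), an in-memory SQLite database stands in for the relational store, and the event data, the `page_summary` table, and all function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw events standing in for a large, loosely structured log.
RAW_EVENTS = [
    {"user": "alice", "page": "/home", "ms": 120},
    {"user": "bob", "page": "/home", "ms": 80},
    {"user": "alice", "page": "/docs", "ms": 200},
    {"page": "/docs", "ms": 100},  # missing "user": semi-structured input
    {"user": "bob", "page": "/home", "ms": 100},
]


def build_summary(events):
    """Scan the raw data once, collecting hit count and total latency per page."""
    totals = defaultdict(lambda: [0, 0])  # page -> [hits, total_ms]
    for event in events:
        page = event.get("page")
        if page is None:
            continue  # skip records that cannot be attributed to a page
        totals[page][0] += 1
        totals[page][1] += event.get("ms", 0)
    return totals


def save_summary(conn, totals):
    """Persist the aggregates in a relational table (name is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_summary "
        "(page TEXT PRIMARY KEY, hits INTEGER, total_ms INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO page_summary VALUES (?, ?, ?)",
        [(page, hits, total_ms) for page, (hits, total_ms) in totals.items()],
    )
    conn.commit()


def avg_latency(conn, page):
    """Answer a dashboard-style question from the summary, not the raw data."""
    row = conn.execute(
        "SELECT total_ms * 1.0 / hits FROM page_summary WHERE page = ?",
        (page,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
save_summary(conn, build_summary(RAW_EVENTS))
```

After the single pass, a call such as `avg_latency(conn, "/home")` touches only the small summary table. Whether the pass itself runs on MapReduce, Spark, or something else is orthogonal to the pattern, which is the point the reply makes.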