| You are right that each problem calls for the proper tool. The Hadoop world is harder to work in than single-machine systems (like pandas), so you shouldn't use Hadoop if you can do the job with a simpler system. But I have something to add. Hadoop does not only introduce new techniques for distributed storage and computation; it also proposes a methodological change in the way a data project is approached. I'm not talking only about running some analytics over the data, but about building an entire data-driven system. A good example is building a vertical search engine, say for classified ads from around the world. You can try to build the system with just a database and some workers processing the data, but you'll soon run into a lot of problems managing it. Hadoop gives you all the storage and processing power you want (it is a matter of money). What if you build your system so that it always recomputes everything from the raw input data? That can look stupid: why do that if you can run the system with fewer resources? The answer is that with this approach you can:
- Be tolerant of human failure. If somebody introduces a bug into the system, you just have to fix the code and relaunch the computation. That is not possible with stateful systems, such as those based on doing updates over a database.
- Be very agile when evolving the system. Changing the whole system is not traumatic: you just change the code and relaunch the process with the new code, without much impact on the system. That is not simple in database-backed systems.
The following page shows how a vertical search engine would be built using Hadoop and what its advantages would be: http://www.datasalt.com/2011/10/scalable-vertical-search-eng... |
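To make the "recompute everything from the raw input" idea concrete, here is a minimal single-machine sketch in plain Python. It is not Hadoop code, and `RAW_ADS`, `normalize_v1`, `normalize_v2` and `build_index` are hypothetical names made up for illustration. The only assumption is that the raw input is kept untouched and the index is always derived from it.

```python
# Hypothetical raw input: an append-only set of classified ads that is never modified.
RAW_ADS = [
    {"id": 1, "title": "  Red Bike ", "price": "120"},
    {"id": 2, "title": "Used SOFA", "price": "80"},
]


def normalize_v1(ad):
    # Buggy version: forgets to strip surrounding whitespace from the title.
    return {"id": ad["id"], "title": ad["title"].lower(), "price": int(ad["price"])}


def normalize_v2(ad):
    # Fixed version: strips whitespace before lowercasing.
    return {"id": ad["id"], "title": ad["title"].strip().lower(), "price": int(ad["price"])}


def build_index(raw_ads, normalize):
    """Rebuild the whole index from the raw input; no state is carried over."""
    return {ad["id"]: normalize(ad) for ad in raw_ads}


# The bug ships: every title in the index keeps its stray whitespace.
index = build_index(RAW_ADS, normalize_v1)

# Recovery is "fix the code and relaunch": the bad index is simply thrown away.
index = build_index(RAW_ADS, normalize_v2)
```

In a system that updates records in place, the output of `normalize_v1` would have overwritten the stored rows, and repairing them would need a separate migration. Here the raw data never changed, so a full rerun is the repair.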
Could you elaborate? This sounds like the NoSQL argument that relational databases are "not agile", which usually means relational databases complain that you have records that won't logically fit the changes you just made.