by equark
5028 days ago
Another key fact is that "big data" is actually not that common, especially by the time it reaches the analysis stage. The median job size at Microsoft and Yahoo is only 15GB, and 90% of Hadoop jobs at Facebook are under 100GB (cite: http://research.microsoft.com/pubs/163083/hotcbp12%20final.p...). Clearly you want to be able to crunch large log files, but in day-to-day analysis the files are much smaller than that. At Sense (http://www.senseplatform.com), most of the clients we work with are struggling not with the size of their data but with tricky modeling problems that don't fit into standard black boxes, and with integrating analytics into actual production systems. Adopting something like Hadoop for these tasks is not very productive.
I'll be keeping this PDF in my "rebuttals to idiocy" folder.
There are certainly some industries that do have "big data" (Wikipedia has some definitions of "big data" that include size ranges, for whatever that's worth), but companies with "big data" do not seem to be the only targets of "big data" marketing. And from what I know about available solutions, if I really had a "big data" problem (e.g., 100 terabytes, not 100 gigabytes), I would not be choosing Hadoop. I also would not choose SQL or "NoSQL". But that's just me. Some of the best solutions I've found have nearly zero marketing. Go figure.