by bkirwi
5008 days ago
You realize he works at Twitter, right? And that he's responsible for Storm, and Cascalog? I'm not going to say anything better than it was said in the slides / book draft, so I'll just encourage you to take these techniques seriously... they're born out of necessity, not because they sound like fun, and real people are using them to solve problems that are hell to solve any other way. That said, these are not problems that everyone has. If you're not nodding your head along with the mutability / sharding / whatever complaints at the beginning of the deck, you can probably still get by with a more traditional architecture. (Also, rereading... I should probably note that not everything needs to be kept forever; only the source data, since the views can be recomputed from it at any time. That makes things a bit cheaper.)
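To make the "views can be recomputed" point concrete, here's a toy sketch in Python. Everything in it is made up for illustration (the `master_log` event list, the `build_pageview_counts` function); it is not code from the slides or the book, just the general idea that a view is a pure function of an append-only source dataset, so you can throw the view away and rebuild it whenever you like.

```python
from collections import Counter

# Hypothetical append-only source data: (user, url) pageview events.
# In this sketch, this list is the only thing that has to be kept forever.
master_log = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/about"),
]


def build_pageview_counts(events):
    """Recompute the view from scratch as a pure function of the source data."""
    return Counter(url for _user, url in events)


# Build the view, then throw it away: nothing is lost, because the
# source data is still there.
view = build_pageview_counts(master_log)
view = None

# New data is only ever appended, never updated in place.
master_log.append(("carol", "/about"))

# Rebuilding the view picks up everything, old and new.
view = build_pageview_counts(master_log)
```

A real setup would recompute in batch over a distributed store rather than an in-memory list, but the storage argument is the same: the derived views are disposable, so only the raw events need durable, long-term storage.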
|
'Bigness' of data != data size
'Bigness' of data == data size / budget
Twitter isn't a typical company. I assume they have both a budget and competent management that will let them get away with something like the Lambda architecture.
I reckon it's a lot harder to scale to even a terabyte under the constraints of a grubby setting like a data warehouse for some instrument monitoring company.
Those guys will allow at best MS SQL for storage, and won't mind putting their developers through hell.