by mmt
2905 days ago
> MapR has made a business selling more efficient Hadoop-ishness, but also depressing how infrequently I see them deployed, even for pretty massive clusters.

I'm only very slightly familiar with their features/value-add and not at all with their pricing. Could the pricing model be particularly unpalatable for some reason?

Not that I expect there has to be a deeper reason beyond simply not caring about cost/efficiency. I've certainly both seen and heard described plenty of Hadoop installations that seemed to have missed the "cheap" point in Google's M-R paper and in subsequent Hadoop hardware selection advice from, for example, Hortonworks, or to have misunderstood what it meant.

There may also be some misreading of "commodity" or "industry standard" as meaning server hardware of a certain "class" (such as a brand name or redundancy features), even where that conflicts with cheapness.

Some of it may be that the hardware selection advice articles (e.g. Hortonworks, Cloudera) are very old, with excellent general advice but potentially misleading specific numbers. Even extrapolating from those numbers in a naive way can easily lead to needless expense and/or sub-optimal performance (remember that time some Xeons had 3 memory channels, not 2 and not 4).

The latest article I found in an (admittedly quick) search was https://hadoopoopadoop.com/2015/09/22/hadoop-hardware/ from late 2015, which is still remarkably long ago and is rather verbose.