by monstrado
4380 days ago
Although I have a lot of respect for the AMPLab, they did not do their due diligence with that benchmark, mainly for a few reasons. They didn't test using columnar storage in Hadoop (ORC / Parquet), even though Redshift uses a proprietary columnar store underneath. Also, the most complicated query they ran was a two-table join, and from what I can tell, there wasn't any concurrent workload testing.

(Disclaimer: I'm a Cloudera employee.) I recommend checking out the following blog post, not because my employer wrote it, but because the guys behind the benchmark did an incredible job making it competitive. They also show metrics that a lot of other benchmarks are not showing, for example concurrent workload capabilities, CPU efficiency, etc.

Impala, Hive (on Tez), Shark, Presto: http://blog.cloudera.com/blog/2014/05/new-sql-choices-in-the...