by saidajigumi 4866 days ago
Hive is not particularly fast in and of itself; it just has horizontal scaling and a SQL-ish front-end. Looking at AWS Redshift's homepage[1] (emphasis added):

> Amazon Redshift delivers fast query and I/O performance for virtually any size dataset by using columnar storage technology and parallelizing and distributing queries across multiple nodes.

Column-store databases[2] can be screamingly fast for analytics operations compared to RDBMSs or other DB types (à la assorted NoSQL). See Kdb[3a][3b] or MonetDB[4] for examples of specific implementations. I'd fully expect a competent column store designed for horizontal scaling to obliterate Hive for a wide range of problems.
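
To make the column-store point concrete, here's a toy sketch in Python. It is not how Redshift, Kdb, or MonetDB are actually implemented; the table, field names, and values are made up for illustration. It only shows why an aggregate over one field can read far less data when each field is stored as its own array:

```python
# Toy illustration only: the table, field names, and values are hypothetical.
rows = [
    {"user_id": 1, "country": "US", "amount": 10.0},
    {"user_id": 2, "country": "DE", "amount": 25.5},
    {"user_id": 3, "country": "US", "amount": 4.5},
]


def total_amount_row_store(records):
    """Row layout: every full record is visited to read a single field."""
    return sum(record["amount"] for record in records)


def to_column_store(records):
    """Pivot the records into one list per field (a column layout)."""
    return {name: [record[name] for record in records] for name in records[0]}


def total_amount_column_store(columns):
    """Column layout: only the 'amount' list is read; other fields are untouched."""
    return sum(columns["amount"])


columns = to_column_store(rows)
row_total = total_amount_row_store(rows)
col_total = total_amount_column_store(columns)
```

Both totals agree; the difference is how much data has to be scanned to get there. Real column stores add compression and vectorized execution on top of this, which a toy like this doesn't attempt to show.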

The usual big-data caveat: you need to pay attention to the fit of your tools against your problem and your data. I don't expect Redshift to be any different. Still, it's pretty exciting to see a new analysis DB tech cropping up like this. And doubly interesting to see this coming from Amazon.

[1] https://aws.amazon.com/redshift/

[2] https://en.wikipedia.org/wiki/Column-oriented_DBMS

[3a] http://kx.com/kdb-plus.php

[3b] https://en.wikipedia.org/wiki/K_%28programming_language%29#K...

[4] http://www.monetdb.org/Home

1 comment

SAP HANA has a column store, and a row store, and does OLAP (Analytics) and OLTP.

There is a lot of new DB tech; Redshift doesn't seem particularly competitive at the moment unless you only need to use it a portion of the time, which is where Amazon excels.