by capkutay 4473 days ago
Can anyone comment on Cassandra's performance versus HBase? HBase adds the complexity of dealing with the whole HStack, and clustering can be a pain if you have no need to use HDFS/ZooKeeper in the first place. Cassandra seems nice because it's a single platform and each node is an equal member of the cluster, so there's no need to designate an HMaster, data nodes, etc. I was just wondering if Cassandra is widely regarded as less performant.

I know the true answer lies in my exact use cases and weeks of initial testing, but it would be nice to hear someone's opinion first.

2 comments

Honestly, just google "HBase vs. Cassandra" and go from there.

Just keep in mind that between the two, Cassandra has improved a lot more than HBase has.

HBase can be an easy choice if you already have a Hadoop cluster and want to roll the results of MapReduce jobs into HBase keys.

The DataStax crew has a decent stack for turning batch/OLAP jobs into queryable keys.

If you have no need of that, want tunable consistency, and favor write availability over read performance, then Cassandra might be a fit.
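
To make "tunable consistency" a bit more concrete, here is a rough sketch of the usual replica-overlap arithmetic: a read is guaranteed to hit at least one replica that acknowledged the latest write when read replicas + write replicas > replication factor. ONE, QUORUM and ALL are real Cassandra consistency levels, but the helper names below (quorum, replicas_required, read_overlaps_write) are made up for illustration and are not part of any driver API.

```python
# Sketch only: models the R + W > N overlap rule, not Cassandra itself.
# All function names here are hypothetical helpers for illustration.

def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """How many replicas must acknowledge at a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported level in this sketch: {level}")


def read_overlaps_write(replication_factor: int, write_level: str, read_level: str) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for write_level, read_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(write_level, read_level, read_overlaps_write(3, write_level, read_level))
```

The trade-off mentioned above shows up here: writing at ONE keeps writes available when replicas are down, but then you either read at ALL (slower, less available reads) or accept that a read may miss the latest write.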

Just uh, don't pretend Cassandra clusters are necessarily trivial to manage just because they're homogeneous.

IMHO: put off moving to any of these technologies as long as possible.

MapR M7 Tables removes much of the complexity and layers of HBase (http://www.mapr.com/products/m7).