Hacker News
by anshumaniax 2092 days ago
Really, we went the exact opposite way for one of our use cases, which required table scans, and Spark just lost. HBase simply stores bytes in sorted order, and once you know how to optimize for that storage layout, there are lots of wins to be had. I guess the use case here was aggregation, so I can definitely see some Spark advantages.
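
To illustrate the "bytes in sorted order" point, here is a rough sketch of why a scan over a key range is cheap when keys are sorted. It is plain Python, not HBase client code: the `rows` list, the `user#...|date` key layout, and `prefix_scan` are all made up for the example.

```python
import bisect

# Hypothetical in-memory stand-in for a store that keeps byte keys sorted.
# The key layout "user#<id>|<date>" is an assumption made for this example.
rows = sorted([
    (b"user#001|2020-01-01", b"a"),
    (b"user#001|2020-01-02", b"b"),
    (b"user#002|2020-01-01", b"c"),
    (b"user#010|2020-01-01", b"d"),
])
keys = [k for k, _ in rows]


def prefix_scan(prefix: bytes):
    """Return all (key, value) pairs whose key starts with `prefix`."""
    # Binary search jumps straight to the first candidate key.
    start = bisect.bisect_left(keys, prefix)
    out = []
    for k, v in rows[start:]:
        # Keys are sorted, so the first non-matching key ends the scan.
        if not k.startswith(prefix):
            break
        out.append((k, v))
    return out


result = prefix_scan(b"user#001|")
```

The scan touches only the matching rows plus one extra, which is why putting the field you filter on at the front of the key matters so much.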
1 comment

Yes, HBase index scans are very fast.

(Any index scan is extremely fast, really. You can build your own indexes as separate Parquet files if you want to avoid HBase for some reason.)
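
A minimal sketch of the "build your own index" idea, with plain Python lists standing in for Parquet files so it runs without any dependencies. The names `data_files`, `build_index`, and `lookup` are hypothetical; in practice the index entries would be written out as their own sorted file next to the data.

```python
import bisect

# Hypothetical data "files": each list stands in for one Parquet file.
data_files = {
    "part-0": [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}],
    "part-1": [{"id": 3, "city": "Oslo"}, {"id": 4, "city": "Pune"}],
}


def build_index(files, column):
    """Build a sorted list of (value, file name, row position) entries."""
    entries = [
        (row[column], name, pos)
        for name, rows in files.items()
        for pos, row in enumerate(rows)
    ]
    entries.sort()
    return entries


def lookup(index, files, value):
    """Fetch the rows matching `value` by reading only indexed positions."""
    # (value,) sorts before every (value, name, pos) entry, so this finds
    # the first entry for `value` if one exists.
    lo = bisect.bisect_left(index, (value,))
    hits = []
    for key, name, pos in index[lo:]:
        if key != value:
            break
        hits.append(files[name][pos])
    return hits


city_index = build_index(data_files, "city")
oslo_rows = lookup(city_index, data_files, "Oslo")
```

The lookup never scans the data files themselves; it walks a short run of the sorted index and then reads just the rows it points at.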