by anshumaniax
2092 days ago
We actually went the opposite way for one of our use cases, which required table scans, and Spark just lost. HBase simply stores bytes in sorted order, and once you know how to optimize for that storage layout there are a lot of wins to be had. I guess the use case here was aggregation, so I can definitely see some Spark advantages.
(Any index scan is extremely fast, really. You can build your own indexes as separate Parquet files if you want to avoid HBase for some reason.)
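
To make the "build your own index" idea concrete, here is a rough sketch of what such a side index amounts to: a sorted list of (column value, row position) pairs that you binary-search instead of scanning every row. This is only an illustration under assumptions. The in-memory lists stand in for the data file and the separate index file, and the names `rows`, `build_index`, and `index_lookup` are made up for the example, not part of Spark, Parquet, or HBase.

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in for the data file: rows in arbitrary (unsorted) order.
rows = [
    {"user": "carol", "amount": 30},
    {"user": "alice", "amount": 10},
    {"user": "bob", "amount": 20},
    {"user": "alice", "amount": 5},
]


def build_index(rows, column):
    """Build a side index: column values in sorted order, plus the
    position of the row each value came from."""
    pairs = sorted((row[column], pos) for pos, row in enumerate(rows))
    keys = [key for key, _ in pairs]
    positions = [pos for _, pos in pairs]
    return keys, positions


def index_lookup(keys, positions, rows, value):
    """Binary-search the sorted keys, then fetch only the matching rows
    instead of scanning the whole data set."""
    lo = bisect_left(keys, value)
    hi = bisect_right(keys, value)
    return [rows[pos] for pos in positions[lo:hi]]


keys, positions = build_index(rows, "user")
matches = index_lookup(keys, positions, rows, "alice")
```

The lookup touches only the index entries for the requested value, which is the same reason a scan over sorted keys is cheap: the matching entries sit next to each other.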