by nfa_backward
3915 days ago
Kudu is being positioned as filling the gap between HDFS and HBase. After reading the overview, I see it more as combining features from HDFS, Parquet, and HBase. Does that sound reasonable? Super excited about this, and even more so since it is open source. Thank you!
The idea is to get the analytic scan performance of Parquet while still allowing for in-place updates and row-by-row access like HBase.
HDFS (with Parquet or other formats) will still be better for unstructured or fully immutable datasets. HBase will still be better when your top priorities are ingest rate, random access, and semi-structured data. Kudu should be good when you've got tabular data as described above.
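To make the trade-off a bit more concrete, here is a rough toy sketch in Python. It is not Kudu's API or storage format; the class `ToyColumnarTable` and its methods are hypothetical names made up for illustration. It only shows the general idea of keeping data column by column (so a scan over one column touches just that column) while a primary-key index still allows row-level lookups and in-place updates.

```python
# Toy illustration only: this is NOT Kudu's API or its on-disk format.
# All names here (ToyColumnarTable, upsert, get, scan_sum) are hypothetical.


class ToyColumnarTable:
    """Keeps each column in its own list, plus a primary-key index for row access."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}
        self.row_of_key = {}  # primary key -> row position

    def upsert(self, key, row):
        """Insert a new row, or update the given columns of an existing row in place."""
        pos = self.row_of_key.get(key)
        if pos is None:
            # New key: append one value to every column (row must supply all columns).
            self.row_of_key[key] = len(self.row_of_key)
            for name, values in self.columns.items():
                values.append(row[name])
        else:
            # Existing key: overwrite only the columns that were passed in.
            for name, value in row.items():
                self.columns[name][pos] = value

    def get(self, key):
        """Row-by-row access: assemble one row from all columns."""
        pos = self.row_of_key[key]
        return {name: values[pos] for name, values in self.columns.items()}

    def scan_sum(self, column):
        """Analytic scan: reads a single column and ignores the others."""
        return sum(self.columns[column])


table = ToyColumnarTable(["host", "cpu"])
table.upsert(1, {"host": "a", "cpu": 10})
table.upsert(2, {"host": "b", "cpu": 30})
table.upsert(1, {"cpu": 20})  # in-place update of an existing row
```

After these calls, `table.scan_sum("cpu")` reads only the `cpu` column, and `table.get(1)` returns the updated row. A real system has to deal with durability, concurrency, and encoding, none of which this sketch attempts.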