Hacker News
by brandmeyer 1693 days ago
Maybe this is a silly question: Why is the A/B choice between a row-major database and a column-major database, instead of between row-major tables and column-major tables within a flexible database?

What's stopping the other leading brands from implementing columnar storage, queries, and such with a COLUMN MAJOR table attribute?

2 comments

Some databases do offer both, but it is much more involved than just changing the storage model. The entire query execution model needs to adapt to columnar execution. You can simulate a column-store model in a row-store database by splitting a table into a series of single-column tables, but the performance benefits you will capture are much smaller than those of a system that is designed and optimized for column-store execution.
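To make the "series of single-column tables" idea concrete, here is a minimal sketch in plain Python, not tied to any real database. The sample data and the helper names (`split_into_column_tables`, `avg_age_row_store`, `avg_age_column_store`) are made up for illustration. It only shows the change in storage layout; it says nothing about the vectorized execution a real column store adds on top.

```python
# Row-major layout: each record is stored together as one tuple.
# (Hypothetical sample data: id, name, age.)
rows = [
    (1, "alice", 30),
    (2, "bob", 25),
    (3, "carol", 35),
]


def split_into_column_tables(rows, names):
    """Split a row-major table into one list per field.

    Row position acts as the implicit join key between the
    resulting single-column "tables".
    """
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


columns = split_into_column_tables(rows, ["id", "name", "age"])


def avg_age_row_store(rows):
    # Every whole row is visited, even though only one field is needed.
    return sum(row[2] for row in rows) / len(rows)


def avg_age_column_store(columns):
    # Only the "age" column is read; the other columns are never touched.
    ages = columns["age"]
    return sum(ages) / len(ages)
```

Both functions return the same average. The difference is how much data each one has to walk through, which is the part a row-store engine can imitate but not fully exploit.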
SQL execution over columnar data is quite different from execution in a row-based database, so it's effectively a different database engine. A columnar store offers several advantages because it usually employs a form of vocabulary compression (often called dictionary encoding). For instance, obtaining the distinct values of a field in a columnar DB is much faster because the result is typically just the vocabulary of the field, so it doesn't even require a full table scan. Many other columnar computations, such as filtering or aggregation, can be done on the compressed data without decompressing it.
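A rough sketch of the vocabulary (dictionary) compression described above, again in plain Python. The function names and the sample `country` column are hypothetical, and this is not how any particular engine implements it. It assumes the vocabulary contains only values that still occur in the column; a real system with deletes would have to account for stale entries.

```python
def dictionary_encode(values):
    """Encode a column as (vocabulary, integer codes)."""
    vocabulary = []   # code -> value
    code_of = {}      # value -> code
    codes = []
    for value in values:
        if value not in code_of:
            code_of[value] = len(vocabulary)
            vocabulary.append(value)
        codes.append(code_of[value])
    return vocabulary, codes


def distinct(vocabulary):
    # DISTINCT is answered from the vocabulary alone; the codes
    # (one per row) are never scanned.
    return list(vocabulary)


def count_equal(vocabulary, codes, target):
    # Translate the predicate into a code once, then compare small
    # integers; no row value is ever decoded back to a string.
    if target not in vocabulary:
        return 0
    target_code = vocabulary.index(target)
    return sum(1 for code in codes if code == target_code)


# Hypothetical sample column.
country = ["US", "DE", "US", "FR", "DE", "US"]
vocabulary, codes = dictionary_encode(country)
```

Here `distinct(vocabulary)` depends only on the number of distinct values, not the number of rows, and `count_equal` filters and aggregates directly on the compressed codes.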