by hodgesrm 1532 days ago
> Somehow I've never realized that row oriented storage is orthogonal to how disks work...

The section you posted is very misleading. Storage is arranged in blocks. Database performance comes down to how you lay out data in those blocks and how well your access patterns to those blocks match the capabilities of the device. That layout choice is fundamental.

If your database stores shopping baskets for an eCommerce site, you want each basket in the smallest number of blocks, ideally one. That makes inserting, updating, and reading single baskets very fast on most modern storage devices.
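
To make the basket case concrete, here is a rough Python sketch that counts how many blocks a single-basket read touches under each layout. The 4096-byte block size, the field names and widths, and the basket count are all made-up assumptions for illustration, not taken from any real system.

```python
from __future__ import annotations

BLOCK_SIZE = 4096  # bytes per storage block (assumed for illustration)


def blocks_touched(offset: int, length: int, block_size: int = BLOCK_SIZE) -> set[int]:
    """Return the block numbers overlapped by bytes [offset, offset + length)."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return set(range(first, last + 1))


# Hypothetical fixed-width basket record: field name -> width in bytes.
FIELDS = {"basket_id": 8, "customer_id": 8, "total_cents": 8, "items": 488}
RECORD_SIZE = sum(FIELDS.values())  # 512 bytes, so 8 records fit in one block
NUM_BASKETS = 100_000


def row_store_blocks(i: int) -> set[int]:
    """Row layout: basket i is one contiguous record."""
    return blocks_touched(i * RECORD_SIZE, RECORD_SIZE)


def column_store_blocks(i: int) -> set[int]:
    """Column layout: each field lives in its own contiguous array."""
    touched: set[int] = set()
    base = 0
    for width in FIELDS.values():
        touched |= blocks_touched(base + i * width, width)
        base += NUM_BASKETS * width  # the next column starts after this one
    return touched


if __name__ == "__main__":
    i = 12_345
    print("row layout blocks:   ", len(row_store_blocks(i)))
    print("column layout blocks:", len(column_store_blocks(i)))
```

With these assumed sizes the row record always fits inside one block, while the columnar read has to visit at least one block per field. Real engines add caching, page headers, and variable-width data, so treat this only as a picture of the access pattern.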

If your database stores data for analytic queries, it's better (in general) to store each column as an array of values. That makes compression far better, and also makes scanning single columns very efficient.
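
A small Python sketch of the analytic case, using only the standard library. The table (order id, country, amount), the row count, and the data distribution are hypothetical toy values chosen to be deterministic. It compares how many bytes a single-column aggregate has to read in each layout, and prints zlib-compressed sizes so you can compare them yourself.

```python
import struct
import zlib

NUM_ROWS = 10_000
ROW_FORMAT = "<I2sI"  # order_id (4 bytes), country (2 bytes), amount_cents (4 bytes)

# Hypothetical toy data: first half "US", second half "DE".
rows = [
    (i, b"US" if i < NUM_ROWS // 2 else b"DE", (i * 37) % 1000)
    for i in range(NUM_ROWS)
]

# Row layout: whole records stored back to back.
row_store = b"".join(struct.pack(ROW_FORMAT, *r) for r in rows)

# Column layout: one packed array per field.
column_store = {
    "order_id": struct.pack(f"<{NUM_ROWS}I", *(r[0] for r in rows)),
    "country": b"".join(r[1] for r in rows),
    "amount_cents": struct.pack(f"<{NUM_ROWS}I", *(r[2] for r in rows)),
}


def sum_amount_from_rows(data: bytes) -> int:
    """Every full record is read just to pick out one field."""
    return sum(rec[2] for rec in struct.iter_unpack(ROW_FORMAT, data))


def sum_amount_from_column(col: bytes) -> int:
    """Only the amount array is read."""
    return sum(v[0] for v in struct.iter_unpack("<I", col))


if __name__ == "__main__":
    assert sum_amount_from_rows(row_store) == sum_amount_from_column(
        column_store["amount_cents"]
    )
    print("bytes read, row layout:   ", len(row_store))
    print("bytes read, column layout:", len(column_store["amount_cents"]))
    print("row store compressed:     ", len(zlib.compress(row_store)))
    for name, col in column_store.items():
        print(f"column {name} compressed:", len(zlib.compress(col)))
```

The low-cardinality country column is where the columnar layout helps compression most in this toy data. Real column stores use specialized codecs and sorted data, so actual ratios will differ.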

To say, as the article does, that "row oriented is slavishly tied to design ideas of filing cabinets and manila folders" is nonsense. Plus there are many other choices about how to access data, including parallelization, alignment with processor caches, trading off memory vs. storage, whether you have a cost-based query optimizer, etc. Even within column stores there are big differences in performance because of these.

(Disclaimer: I work on ClickHouse and love analytic systems. They are great but not for everything.)