Hacker News
by wonnor 1469 days ago
So you have a primary key ("row key"), and each column C is essentially its own table containing column C and the primary key column, with the index on column C. You can see how there is data duplication here, right? Each primary key value is duplicated once per column.

So we have a mapping from column value to row index. Is there an inverse mapping stored if I want to find all column values contained in a given row?

1 comment

Yes, there is duplication of keys, since each column holds a copy of every key that has a value in that column.

Each column is a key-value store that maps both ways. It is really fast to find all the keys that are mapped to a given value (or pattern), and just as fast to find all the values that are mapped to a given key.
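To make the two-way mapping concrete, here is a rough Python sketch. It is only an illustration, not the actual implementation: the `Column` class, its method names, and the `row` helper are all made up for this example, and it assumes one value per key per column and exact-match lookups only (no pattern search).

```python
from collections import defaultdict


class Column:
    """Hypothetical two-way column: row key -> value and value -> row keys."""

    def __init__(self):
        self.key_to_value = {}                  # forward map
        self.value_to_keys = defaultdict(set)   # inverse map

    def put(self, key, value):
        # Remove any old mapping first so both directions stay consistent.
        if key in self.key_to_value:
            old = self.key_to_value[key]
            self.value_to_keys[old].discard(key)
            if not self.value_to_keys[old]:
                del self.value_to_keys[old]
        self.key_to_value[key] = value
        self.value_to_keys[value].add(key)

    def keys_for(self, value):
        # All row keys whose value in this column equals `value`.
        return set(self.value_to_keys.get(value, ()))

    def value_for(self, key):
        # The value for a row key, or None if the column has no entry for it.
        return self.key_to_value.get(key)


def row(columns, key):
    # Rebuild a row by asking every column for the value mapped to `key`.
    return {
        name: col.key_to_value[key]
        for name, col in columns.items()
        if key in col.key_to_value
    }


columns = {"name": Column(), "city": Column()}
columns["name"].put(1, "Ann")
columns["city"].put(1, "Oslo")
columns["name"].put(2, "Bob")
columns["city"].put(2, "Oslo")
```

With this, `columns["city"].keys_for("Oslo")` answers "which rows have this value", and `row(columns, 1)` answers the inverse question of which values a given row holds.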

You could say the key is a primary key, since each key is unique, but a table can also have a 'Primary Key' column, where each unique value can be mapped to only a single key and every row must have a value in that column.
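A rough sketch of what that 'Primary Key' column constraint might look like, again in Python. The `UniqueColumn` class and `missing_keys` method are hypothetical names for illustration, and this only shows the two rules described above (unique values, a value for every row), not how the real system enforces them.

```python
class UniqueColumn:
    """Hypothetical 'Primary Key' column: each value maps to exactly one row key."""

    def __init__(self):
        self.key_to_value = {}
        self.value_to_key = {}

    def put(self, key, value):
        # Reject a value that already belongs to a different row key.
        if value in self.value_to_key and self.value_to_key[value] != key:
            raise ValueError(f"value {value!r} is already mapped to another key")
        # Drop the key's previous value so the inverse map stays in sync.
        if key in self.key_to_value:
            del self.value_to_key[self.key_to_value[key]]
        self.key_to_value[key] = value
        self.value_to_key[value] = key

    def missing_keys(self, all_keys):
        # Row keys that violate the "every row must have a value" rule.
        return set(all_keys) - set(self.key_to_value)


pk = UniqueColumn()
pk.put(1, "A-100")
pk.put(2, "A-200")
```

Calling `pk.put(3, "A-100")` raises `ValueError` here, and `pk.missing_keys([1, 2, 3])` reports row 3 as lacking a primary key value.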

This system is very efficient for sparse tables. Null values are essentially free, since the key-value pair is simply missing if a column does not have a value for the row. It takes some processing to determine that the key is not mapped anywhere in the column, but that check is very fast.
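A tiny Python sketch of why nulls cost nothing, assuming (for illustration only) that each column is a plain dict from row key to value. The `table`, `get`, and `stored_cells` names are made up for this example.

```python
# Each column is a dict from row key to value; a missing key means NULL.
table = {
    "name": {1: "Ann", 2: "Bob", 3: "Cy"},
    "phone": {2: "555-0100"},  # rows 1 and 3 have no phone, so nothing is stored
}


def get(table, column, key):
    # A failed lookup is the "processing" needed to discover a NULL.
    return table[column].get(key)


def stored_cells(table):
    # Count only the key-value pairs that actually exist.
    return sum(len(col) for col in table.values())
```

Here `get(table, "phone", 1)` returns `None`, and `stored_cells(table)` counts 4 stored pairs where a dense 3-row by 2-column layout would hold 6 cells.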