by susan_hall 3580 days ago
At sufficient scale, the space taken up by key names can become a big problem.

Traditional SQL databases store column names separately from the table data, but in JSON, the keys are part of the data and are repeated in every record.

I've worked with MongoDB systems that held 100 terabytes of data. At that scale, we had to rewrite all of the keys so that each was only a single character. When JSON is small, it is pleasant to have keys that make sense, such as:

username : mary

but as JSON gets big, this needs to change to:

u : mary

If you have 10 million records, the difference (in this trivial example) is 70 million characters, since "username" is 7 characters longer than "u". If you have 100 different collections of that size with an average of 20 keys each, and each key shrinks by a similar amount, then you are talking about 140 billion characters saved.
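
As a rough check of that arithmetic, here is a minimal Python sketch. The function name chars_saved is made up for illustration, and it assumes every collection holds 10 million records and every key shrinks by the same 7 characters:

```python
def chars_saved(long_key: str, short_key: str, records: int) -> int:
    """Characters saved by renaming one key across `records` documents."""
    return (len(long_key) - len(short_key)) * records


# One key ("username" -> "u") across 10 million records.
per_key = chars_saved("username", "u", 10_000_000)

# Assumption: 100 collections, 20 keys each, same saving per key.
total = per_key * 100 * 20

print(per_key)  # 70000000
print(total)    # 140000000000
```

Real savings would depend on the actual key lengths and record counts, so treat this as a back-of-the-envelope estimate only.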

Most of the companies I know that work with JSON document stores (such as MongoDB) at scale eventually take steps to make the keys smaller.
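
One way to keep readable names in application code while storing short keys is to translate at the boundary. This is only a sketch, not how any particular system did it: KEY_MAP, shorten and expand are hypothetical names, the "email" entry is invented for illustration, and only top-level keys are handled.

```python
import json

# Hypothetical mapping from readable keys to single-character stored keys.
KEY_MAP = {"username": "u", "email": "e"}
REVERSE_MAP = {short: long for long, short in KEY_MAP.items()}


def shorten(doc: dict) -> dict:
    """Replace readable top-level keys with their short forms before storing."""
    return {KEY_MAP.get(key, key): value for key, value in doc.items()}


def expand(doc: dict) -> dict:
    """Restore readable top-level keys after loading a stored document."""
    return {REVERSE_MAP.get(key, key): value for key, value in doc.items()}


doc = {"username": "mary"}
stored = shorten(doc)

print(stored)          # {'u': 'mary'}
print(expand(stored))  # {'username': 'mary'}

# Size difference of the serialized JSON text for this one record.
saved = len(json.dumps(doc)) - len(json.dumps(stored))
print(saved)           # 7
```

Keys missing from the map pass through unchanged, so the mapping can be introduced one key at a time.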