I was going to elaborate and say that even though typical columnar databases already compress their data with some variant of dictionary encoding, I'd like to see a database engine where large objects (bulk text or binary data) are stored efficiently by default. If I were to wave my hands about, I'd say something like a Merkle or Prolly tree of large ~256 KB chunks stored in deduplicated external blob storage, where the individual chunks are compressed with a modern throughput-optimised algorithm.
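
To make the hand-waving slightly more concrete, here is a rough sketch of the chunk-level part of that idea, not a real design. Everything in it is an assumption for illustration: `ChunkStore` and `root_hash` are made-up names, the "external blob storage" is just a dict, chunks are cut at fixed 256 KiB offsets (a Prolly tree would pick content-defined boundaries instead), the "tree" is a single level of hashing over the chunk digests, and zlib level 1 stands in for a modern throughput-optimised codec such as zstd or LZ4 only because it ships with Python.

```python
import hashlib
import zlib

CHUNK_SIZE = 256 * 1024  # roughly the ~256 KB chunk size mentioned above


class ChunkStore:
    """Hypothetical in-memory stand-in for deduplicated external blob storage."""

    def __init__(self):
        # Maps chunk digest (hex string) -> compressed chunk bytes.
        self.blobs = {}

    def put(self, data: bytes) -> list:
        """Split data into fixed-size chunks, store each distinct chunk once,
        and return the ordered list of chunk digests."""
        digests = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.blobs:
                # Assumption: zlib level 1 is only a stdlib placeholder
                # for a fast codec such as zstd or LZ4.
                self.blobs[digest] = zlib.compress(chunk, 1)
            digests.append(digest)
        return digests

    def get(self, digests) -> bytes:
        """Reassemble an object from its ordered chunk digests."""
        return b"".join(zlib.decompress(self.blobs[d]) for d in digests)


def root_hash(digests) -> str:
    """One-level Merkle-style root: a hash over the ordered chunk digests."""
    h = hashlib.sha256()
    for d in digests:
        h.update(bytes.fromhex(d))
    return h.hexdigest()


if __name__ == "__main__":
    store = ChunkStore()
    obj = bytes(CHUNK_SIZE) * 3 + b"tail"  # three identical chunks plus a short one
    refs = store.put(obj)
    print(len(refs), "chunk references,", len(store.blobs), "blobs stored")
    print("round trip ok:", store.get(refs) == obj)
    print("root:", root_hash(refs))
```

The database row would then only need to hold the root hash (or the digest list), and two large objects that share chunks would share the stored blobs. With fixed offsets, an insert near the start of an object shifts every later chunk and defeats the deduplication, which is the problem content-defined chunking is meant to address.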