Hacker News
by ntoshev 3000 days ago
I'd like to be able to load log files into a database and have them take the same space as the compressed log files, not the uncompressed size (which can easily be 20x more). I guess this requires a database built on compact data structures or something similar, but surely simple support for dictionary compression of text fields would help? The values in a text field are small chunks of text of the same type, so they should compress well against a shared dictionary. I've searched around and found some support in RocksDB, but that's pretty much it.

Most compression libraries offer dictionary support, although it's somewhat obscure. For example, zlib lets you supply a preset dictionary but has no function to actually build one from sample data. Bindings and higher-level libraries often ignore the dictionary support altogether.
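
To make the idea concrete, here is a minimal sketch using the preset-dictionary parameter (`zdict`) that Python's standard `zlib` module exposes. The dictionary contents, the sample log line, and the helper names `compress_field` / `decompress_field` are all made up for illustration. Since zlib can't train a dictionary, this one is assembled by hand from substrings likely to recur.

```python
import zlib

# Hypothetical preset dictionary: substrings expected to recur in the
# log lines, concatenated by hand (zlib provides no trainer).
ZDICT = b' HTTP/1.1" 200 GET /index.html 127.0.0.1 - - [Mozilla/5.0'

def compress_field(text: bytes, zdict: bytes = ZDICT) -> bytes:
    """Compress one small field against the shared dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(text) + c.flush()

def decompress_field(blob: bytes, zdict: bytes = ZDICT) -> bytes:
    """Decompress a field; the same dictionary must be supplied."""
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()

# Made-up sample log line, for illustration only.
line = b'127.0.0.1 - - [10/Oct/2016:13:55:36] "GET /index.html HTTP/1.1" 200 2326'

with_dict = compress_field(line)
without_dict = zlib.compress(line, 9)

print(len(line), len(without_dict), len(with_dict))
```

On a single short line like this, plain zlib has almost nothing to work with, while the dictionary version can refer back to the shared substrings. A database doing this per text field would also have to store the dictionary once per column (or per block) and keep it around for as long as any value compressed with it exists.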