by simonw 800 days ago
My hunch is that JSON using a custom compression dictionary with zlib (see zdict argument to https://docs.python.org/3/library/zlib.html#zlib.compressobj) or zstandard would get you most of the benefit while still letting you interact with existing JSON tools. I've not put the work in to prove that to myself though!
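A minimal sketch of the zdict idea, using only Python's standard `zlib` module. The sample records, their field names, and the choice of one sample record as the preset dictionary are all assumptions for illustration, not a measured result. Note that the reader needs the same dictionary to decompress.

```python
import json
import zlib

# Hypothetical records; field names and values are made up for illustration.
records = [
    {"metric": "cpu.usage", "host": f"web-{i}", "timestamp": 1700000000 + i, "value": i * 0.5}
    for i in range(5)
]

# Preset dictionary: bytes expected to recur in the inputs.
# Here we simply reuse one serialized sample record (an assumption;
# a real dictionary would be built from representative data).
ZDICT = json.dumps(records[0]).encode("utf-8")


def compress_json(obj, zdict=None):
    """Serialize obj to JSON and deflate it, optionally with a preset dictionary."""
    raw = json.dumps(obj).encode("utf-8")
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(raw) + comp.flush()


def decompress_json(blob, zdict=None):
    """Inverse of compress_json; must be given the same dictionary."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    raw = decomp.decompress(blob) + decomp.flush()
    return json.loads(raw.decode("utf-8"))


sample = records[1]
plain = compress_json(sample)
with_dict = compress_json(sample, ZDICT)
print(len(json.dumps(sample)), len(plain), len(with_dict))
```

After decompression the payload is ordinary JSON again, so existing JSON tools still work on it.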
2 comments

Since labels and other predefined constants carry no useful information, compressing them better is not going to win the argument.

Have a look at the description and performance of a non-toy time series database published 10 years ago:

https://www.vldb.org/pvldb/vol8/p1816-teller.pdf

The convenience of text and JSON is an argument, but performance??

Yeah that would be an interesting experiment too.

This blog post has some interesting ideas as well: https://www.uber.com/blog/reducing-logging-cost-by-two-order...