Hacker News
by brianolson 3542 days ago
Another way to make JSON smaller is to use CBOR instead, the schema-compatible 'Concise Binary Object Representation' (see IETF RFC 7049 or http://cbor.io/). CBOR encodes and decodes faster too. Or use the Snappy compressor.
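To give a rough feel for the size difference, here is a minimal sketch of a CBOR encoder in Python. It is hand-rolled for illustration and covers only None, booleans, integers, text strings, lists and maps; the function names `_head` and `cbor_encode` and the sample document are made up for this example, and a real project would more likely use an existing CBOR library.

```python
import json


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR initial byte (major type) plus its unsigned argument n."""
    if n < 24:
        # Small arguments fit directly in the low 5 bits of the initial byte.
        return bytes([(major << 5) | n])
    for extra, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | extra]) + n.to_bytes(size, "big")
    raise ValueError("integer too large for this sketch")


def cbor_encode(obj) -> bytes:
    """Encode a JSON-like value as CBOR (subset only: no floats, no tags)."""
    if obj is None:
        return b"\xf6"
    if obj is True:  # checked before int, because bool is a subclass of int
        return b"\xf5"
    if obj is False:
        return b"\xf4"
    if isinstance(obj, int):
        # Major type 0 for non-negative, major type 1 stores -1 - n.
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(obj, list):
        return _head(4, len(obj)) + b"".join(cbor_encode(x) for x in obj)
    if isinstance(obj, dict):
        body = b"".join(cbor_encode(k) + cbor_encode(v) for k, v in obj.items())
        return _head(5, len(obj)) + body
    raise TypeError(f"unsupported type in this sketch: {type(obj).__name__}")


if __name__ == "__main__":
    doc = {"a": 1, "b": [2, 3]}  # hypothetical sample document
    as_json = json.dumps(doc, separators=(",", ":")).encode("utf-8")
    as_cbor = cbor_encode(doc)
    print(len(as_json), "bytes as compact JSON")
    print(len(as_cbor), "bytes as CBOR:", as_cbor.hex())
```

Note that the key strings are still stored in full in every document, so the saving comes from dropping quotes, separators and decimal digits, not from deduplicating repeated keys.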

I'm afraid Snappy will not help, at least not much. PostgreSQL already has built-in compression (PGLZ). I tested various algorithms before settling on this shared-dictionary idea: lz4, bzip and others. Some compress a bit better, others a bit faster, but in general the result is almost the same.
Did you test e.g. LZ4 with a prebuilt dictionary? With a good way to find substrings for the dictionary it might generalize well to other kinds of data.
Frankly, I don't remember; it was more than half a year ago.

LZ4 is an LZ77-family algorithm, which means its dictionary is a "shifting window" over recently seen data. I don't believe that kind of dictionary will fit this case.
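For anyone unsure what "prebuilt dictionary" means for an LZ77-style compressor, here is a rough sketch using the preset-dictionary support in Python's standard zlib module. This is DEFLATE, not LZ4 or PGLZ, so it only illustrates the general idea: the dictionary pre-fills the window before the first byte of input, which mostly matters for short documents. The sample document and the hand-written `SHARED_DICT` are invented for the example, and the helper names are hypothetical.

```python
import json
import zlib

# Hypothetical short document, similar in shape to many others in a table.
DOC = json.dumps(
    {"user_id": 42, "status": "active", "country": "US", "plan": "free"}
).encode("utf-8")

# Hand-written dictionary of substrings expected to recur across documents.
# Assumption for this sketch: a real system would derive it from sample data.
SHARED_DICT = b'{"user_id": , "status": "active", "country": "US", "plan": "free"}'


def compress_plain(data: bytes) -> bytes:
    """Compress with an empty window, as a per-value compressor would."""
    return zlib.compress(data)


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress with the window pre-filled from a preset dictionary."""
    comp = zlib.compressobj(zdict=zdict)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress; the same dictionary must be supplied again."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    plain = compress_plain(DOC)
    primed = compress_with_dict(DOC, SHARED_DICT)
    assert decompress_with_dict(primed, SHARED_DICT) == DOC
    print("original:          ", len(DOC), "bytes")
    print("zlib, no dict:     ", len(plain), "bytes")
    print("zlib, preset dict: ", len(primed), "bytes")
```

The catch is that the dictionary has to be stored and versioned alongside the data, since every reader needs exactly the same bytes to decompress.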