by zippie 5124 days ago
Would like to emphasize that this is only really useful in environments where gzip is not available (as the OP notes). Some tests using the demo JSON (minified):

test.json = 285 bytes

test.rjson = 233 bytes (18% smaller)

test.json.gz = 205 bytes (28% smaller)

If you are able to bundle an RJSON parser, why not just bundle an existing, well understood/tested compression scheme such as http://stuartk.com/jszip/ or https://github.com/olle/lz77-kit/blob/master/src/main/js/lz7... instead?
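For anyone who wants to repeat this kind of size check on their own payloads, here is a minimal sketch. The demo JSON from the post is not reproduced here, so `SAMPLE` is a made-up stand-in and `size_report` is a hypothetical helper name; your numbers will differ from the ones above.

```python
import gzip
import json


def size_report(obj, level=9):
    """Return raw and gzip sizes (in bytes) for the minified JSON form of obj."""
    # separators=(",", ":") drops the optional whitespace, i.e. "minified" JSON.
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    packed = gzip.compress(raw, compresslevel=level)
    saved = 100.0 * (len(raw) - len(packed)) / len(raw)
    return {"raw": len(raw), "gzip": len(packed), "saved_pct": round(saved, 1)}


# Hypothetical payload, NOT the demo JSON from the thread.
SAMPLE = [
    {"id": i, "name": "item", "tags": ["a", "b"], "active": True}
    for i in range(8)
]

if __name__ == "__main__":
    print(size_report(SAMPLE))
```

On very small or non-repetitive inputs the gzip figure can come out larger than the raw one, so it is worth measuring on payloads that look like what you actually ship.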

2 comments

An arithmetic coding scheme which has a model based on the probabilities found in JSON abstract syntax trees would significantly improve on most typically used generic compression schemes. Arithmetic coding schemes have largely been avoided thus far due to patents which have recently expired, if I remember correctly.

Using the order-2 precise model on this page I get 190 bytes, and that is still a generic, non-JSON model: http://nerget.com/compression/
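To make "order-2 model" concrete: each byte is predicted from the two bytes before it, and an arithmetic coder spends roughly -log2(p) bits on a symbol the model gave probability p. The sketch below only estimates that ideal size; it is not an actual arithmetic coder and not the model used on the linked page. The add-one smoothing and the function name are my own assumptions.

```python
import math
from collections import defaultdict


def order2_estimate_bytes(data: bytes) -> float:
    """Ideal code length, in bytes, under a naive adaptive order-2 byte model.

    Each byte is predicted from the two preceding bytes using the counts
    seen so far, with add-one smoothing over all 256 byte values (an
    assumption made for this sketch). An arithmetic coder driven by the
    same model would emit close to this many bytes.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> byte -> count
    totals = defaultdict(int)                       # context -> total count
    bits = 0.0
    ctx = b"\x00\x00"  # arbitrary starting context
    for b in data:
        p = (counts[ctx][b] + 1) / (totals[ctx] + 256)
        bits += -math.log2(p)
        # Update the model only after "coding" the byte, as a decoder would.
        counts[ctx][b] += 1
        totals[ctx] += 1
        ctx = ctx[1:] + bytes([b])
    return bits / 8


if __name__ == "__main__":
    sample = b'{"id":1,"name":"item"},' * 20  # hypothetical input
    print(len(sample), order2_estimate_bytes(sample))
```

A model this simple learns slowly on inputs of a few hundred bytes, since every new context starts out near 8 bits per byte. Better smoothing or a JSON-aware prior is where the gains suggested above would have to come from.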

This. JSON-specific compression schemes aren't going to yield gains over AST-friendly schemes unless the JSON serialization specification changes significantly.

Along these lines: shipping a schema with the data payload is Avro-like, which is also questionable in terms of efficiency when compared with gzip/LZO.

They are using gzip compression level 1. Bogus.
Are you referring to the graph, in which they set the gzip compression to "1" in order to clearly show the ratio of compression improvement that their technique has over gzip?
And if you use gzip on a file, it has some overhead (the 10-byte gzip header) and a freshly initialized deflate state. Usually, compression improves as more data is seen, since the dynamic Huffman tree improves and there is more earlier data for LZ77 to back-reference.
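Both effects are easy to measure with Python's `zlib`. In the sketch below, `wbits=31` asks zlib for the gzip wrapper and `wbits=-15` for a raw DEFLATE stream, so the difference is the fixed wrapper cost: the 10-byte header plus the 8-byte CRC32/length trailer that the gzip format also requires. The records are made up and the helper names are hypothetical.

```python
import json
import zlib


def deflate(data: bytes, wbits: int, level: int = 9) -> bytes:
    """Compress with zlib. wbits=31 adds the gzip wrapper, wbits=-15 is raw DEFLATE."""
    c = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return c.compress(data) + c.flush()


def wrapper_overhead(data: bytes) -> int:
    """Bytes added by the gzip header and trailer on top of the raw DEFLATE stream."""
    return len(deflate(data, 31)) - len(deflate(data, -15))


def separate_vs_together(records):
    """Total gzip size when each record is compressed alone vs. all in one stream."""
    separate = sum(len(deflate(r, 31)) for r in records)
    together = len(deflate(b"\n".join(records), 31))
    return separate, together


# Hypothetical small JSON records, not taken from the thread.
RECORDS = [
    json.dumps({"id": i, "name": "item", "active": True}, separators=(",", ":")).encode()
    for i in range(50)
]

if __name__ == "__main__":
    print("wrapper overhead:", wrapper_overhead(RECORDS[0]))
    print("separate, together:", separate_vs_together(RECORDS))
```

Compressing each record alone pays the wrapper cost every time and starts from an empty LZ77 window, which is why a comparison on one tiny file understates what gzip does on a realistic stream.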