by exo762 3540 days ago
It would be nice to see a comparison to a general compression algorithm - e.g. deflate.

There is a comparison table at the end of the project's README (https://github.com/afiskon/zson). However, the table columns are not explained very well. It is not 100% clear to me whether "before" means "uncompressed" or "PGLZ compressed".

  Compression ratio could be different depending on documents,
  database schema, number of rows, etc. But in general ZSON
  compression is much better than built-in PostgreSQL
  compression (PGLZ):

     before   |   after    |      ratio       
  ------------+------------+------------------
   3961880576 | 1638834176 | 0.41365057440843
  (1 row)
  
     before   |   after    |       ratio       
  ------------+------------+-------------------
   8058904576 | 4916436992 | 0.610062688500061
  (1 row)
  
     before    |   after    |       ratio       
  -------------+------------+-------------------
   14204420096 | 9832841216 | 0.692238130775149
Frankly, I don't remember all the details, since I ran this benchmark in February. IIRC it's "ZSON + PGLZ" vs "JSONB + PGLZ".
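
For what it's worth, the "ratio" column in the quoted tables appears to be simply after / before, so a smaller value means better compression. A quick way to check this and to turn it into "space saved" is sketched below. The helper name `size_ratios` is made up for illustration, and the numbers are copied from the tables above.

```python
# Sizes in bytes, copied from the (before, after) columns quoted above.
ROWS = [
    (3961880576, 1638834176),
    (8058904576, 4916436992),
    (14204420096, 9832841216),
]


def size_ratios(rows):
    """Return after / before for each (before, after) pair."""
    return [after / before for before, after in rows]


for (before, after), ratio in zip(ROWS, size_ratios(ROWS)):
    # 1 - ratio is the fraction of space saved relative to "before".
    print(f"{before:>12} -> {after:>11}  ratio={ratio:.4f}  saved={1 - ratio:.1%}")
```

This only confirms the arithmetic. It says nothing about what "before" and "after" were measured on.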

Please note that everything depends heavily on your data. PostgreSQL is smart about what to compress and what to leave uncompressed. In general, it could be any combination of "ZSON +/- PGLZ" vs "JSONB +/- PGLZ".

Don't believe any benchmark I or anyone else did. Re-check everything on your data, configuration, hardware, workload, etc.
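
If you want a rough deflate baseline for your own documents, as suggested at the top of the thread, here is a minimal sketch using Python's standard `zlib` module. It rests on a few assumptions: each document is compressed on its own (a stand-in for per-value compression, not a reproduction of what PostgreSQL or ZSON does), sizes are raw byte counts rather than on-disk table sizes, and `sample_docs` is made-up data that you would replace with rows exported from your own tables.

```python
import json
import zlib


def deflate_ratio(docs, level=6):
    """Return compressed_bytes / original_bytes, compressing each document separately."""
    before = 0
    after = 0
    for doc in docs:
        # Compact JSON encoding, so whitespace does not inflate the "before" size.
        raw = json.dumps(doc, separators=(",", ":")).encode("utf-8")
        before += len(raw)
        after += len(zlib.compress(raw, level))
    return after / before


# Hypothetical sample data; swap in documents from your own database.
sample_docs = [
    {"user_id": i, "event": "page_view", "path": f"/products/{i}", "tags": ["a", "b", "c"]}
    for i in range(1000)
]

print(f"deflate ratio: {deflate_ratio(sample_docs):.3f}")
```

Small documents may barely shrink, or even grow, when compressed one at a time, because each one pays the zlib header and checksum overhead and has little internal repetition. Treat the result as a sanity check to set next to the table sizes you measure in PostgreSQL, not as a replacement for them.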