by pella 1 day ago
OpenZL is the future: https://openzl.org/

  "OpenZL delivers high compression ratios while preserving high speed, a level of performance that is out of reach for generic compressors. OpenZL takes a description of your data and builds from it a specialized compressor optimized for your specific format."
2 comments

OpenZL is nice, but it's often less useful than you'd think: it requires that you know the structure of your data and that you don't care about inspecting that data outside of your program. I've extracted one too many PNG files from a Word document (by renaming .docx to .zip) to want OpenZL everywhere... It might be better for short-term "data in transit" compression than for long-term storage.
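For what it's worth, the rename trick works because a .docx is an ordinary ZIP archive, so any generic ZIP tool can open it. A minimal Python sketch of the same extraction using only the standard library; `extract_pngs` is a name made up for illustration, and it flattens paths, so two PNGs with the same file name would overwrite each other:

```python
import zipfile
from pathlib import Path


def extract_pngs(docx_path, out_dir):
    """Copy every PNG stored in a .docx (a ZIP container) into out_dir.

    Hypothetical helper for illustration; returns the list of written paths.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as zf:
        for name in zf.namelist():
            # Word usually keeps images under word/media/, but matching on
            # the extension avoids assuming a fixed layout.
            if name.lower().endswith(".png"):
                target = out / Path(name).name  # flatten the archive path
                target.write_bytes(zf.read(name))
                written.append(target)
    return written
```

Calling `extract_pngs("report.docx", "images")` (both names are placeholders) would leave the PNGs in `images/`. That kind of inspection with stock tooling is exactly what you give up when the container is a format-specific compressed stream.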
Please check the OpenZL v0.2 + Silesia corpus benchmark.

  "OpenZL to offer 10% faster compression speed and 70% faster decompression speed compared to Zstandard level 1 on the Silesia corpus in our benchmarks."
  "OpenZL now ships its own LZ codec, exposed as ZL_GRAPH_LZ, and the serial profile in zli. It is still being actively developed to expand its feature set and improve performance on small inputs."
https://github.com/facebook/openzl/releases/tag/v0.2.0
The future may be ~ AI-assisted format detection + OpenZL

(~ OpenZL-AI-LLM recognises the data structure, then guides OpenZL toward the best lossless compression path)

  "The unreasonable effectiveness of our first foray into training leads us to believe that the graph model is uniquely positioned to facilitate ML-guided generation of compressors. We are tempted to view this as “the next big thing” in production-scale compression. Whereas compression research has up to now eluded those without domain expertise, we believe the future of application-specific compressors will be unlocked via investment in automated learning methods."
https://arxiv.org/abs/2605.09928 [11 May 2026] OpenZL: Using Graphs to Compress Smaller and Faster
1. Upload data using a conventional compression method (or uncompressed)

2. Spend orders of magnitude (literally) more on compute to run the LLM on the data than any compression algorithm would ever take.