by Smerity
917 days ago
As mentioned, it's trivial across the spread of compression algorithms that support this type of behaviour (`gzip`, `zstandard`, `zip`, ...), and the header in `zip` makes it even more convenient, as you note! WARC as a format essentially states that, unless you have good reason not to, "record at a time" compression is preferred[1].
The mixture of "technically possible" and "part of spec" is what makes it so useful - any generic WARC tool can support random access, there are explicit fields to index over (URL), and even non-conforming WARC files can be easily rewritten to add such a capability.

[1]: https://iipc.github.io/warc-specifications/specifications/wa...
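
For anyone who wants to see the "record at a time" idea concretely, here's a rough sketch in Python using only the standard `gzip` module. It isn't a WARC reader or writer: the payloads are toy byte strings rather than real WARC records, and `write_records`, `read_record`, and the URL-keyed in-memory index are names I made up for illustration. The point is only that each record becomes its own gzip member, so a stored (offset, length) pair is enough to decompress one record without touching the rest.

```python
import gzip
import io


def write_records(records):
    """Compress each (url, payload) pair as its own gzip member.

    Returns the concatenated bytes plus a hypothetical index that maps
    each URL to the (offset, length) of its member in the blob.
    """
    buf = io.BytesIO()
    index = {}
    for url, payload in records:
        offset = buf.tell()
        member = gzip.compress(payload)  # one independent gzip member per record
        buf.write(member)
        index[url] = (offset, len(member))
    return buf.getvalue(), index


def read_record(blob, index, url):
    """Random access: slice out one member and decompress only that."""
    offset, length = index[url]
    return gzip.decompress(blob[offset:offset + length])


if __name__ == "__main__":
    records = [
        ("http://example.com/a", b"first record"),
        ("http://example.com/b", b"second record"),
    ]
    blob, index = write_records(records)
    print(read_record(blob, index, "http://example.com/b"))
```

The concatenated members still form a valid gzip stream, so a tool that knows nothing about the index can decompress the whole thing front to back as usual.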