by abracadaniel 870 days ago
To add to this, WARC.gz files are also concatenated gzip records, so you can read any record by starting decompression at a known offset. This gives you fast random access to individual records while still packing many records into a single file.
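
A minimal sketch of the idea in Python, using plain byte strings in place of real WARC records. The helper names `write_records` and `read_record_at` are made up for illustration; in practice the offsets would come from an index file rather than being tracked in memory like this.

```python
import gzip
import io
import zlib


def write_records(records):
    """Compress each record as its own gzip member and concatenate them.

    Returns (blob, offsets), where offsets[i] is the byte offset at which
    the i-th gzip member starts. (Hypothetical helper, not a WARC writer.)
    """
    buf = io.BytesIO()
    offsets = []
    for rec in records:
        offsets.append(buf.tell())
        buf.write(gzip.compress(rec))
    return buf.getvalue(), offsets


def read_record_at(f, offset, chunk_size=65536):
    """Seek to a known offset and decompress exactly one gzip member."""
    f.seek(offset)
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    out = []
    # d.eof becomes True once the end of this member is reached, so the
    # members that follow are never decompressed.
    while not d.eof:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        out.append(d.decompress(chunk))
    return b"".join(out)


records = [b"record one", b"record two", b"record three"]
blob, offsets = write_records(records)
second = read_record_at(io.BytesIO(blob), offsets[1])
```

Here `second` holds only the middle record, and nothing before `offsets[1]` was read. The same blob is still a valid gzip file as a whole: `gzip.decompress(blob)` returns all three records joined together.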

WACZ extends this further: it lets a client stream an archive from a server and fetch a single page without having to request the whole file. https://replayweb.page/docs/wacz-format
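
As a rough illustration of fetching part of a remote file, here is how an HTTP Range request could be built in Python. This is only a sketch under assumptions: the URL, offset, and length are made up, `build_range_request` is a hypothetical helper, and the linked docs are the place to check how a WACZ client actually locates and fetches records.

```python
import urllib.request


def build_range_request(url, offset, length):
    """Build a request for `length` bytes starting at `offset`."""
    # HTTP byte ranges are inclusive on both ends.
    end = offset + length - 1
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={offset}-{end}")
    return req


# Hypothetical URL and numbers; real values would come from an index.
req = build_range_request("https://example.com/archive.wacz", 100, 50)
```

Passing `req` to `urllib.request.urlopen` would send it. A server that supports range requests answers with 206 Partial Content and only the requested bytes, which could then be decompressed as a single gzip record.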