by rspeer 2984 days ago
HTML is pretty repetitive, but if you want to archive HTML data, you don't get to redefine what HTML is. Compression is useful.

This is what the WARC [0] file format (and/or gzip) is for.

[0] https://en.m.wikipedia.org/wiki/Web_ARChive

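And/or xz? xz typically gives a better compression ratio than gzip. (WARC itself is a container format, so it isn't a competitor here; its records can be compressed.)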
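
If you want to check the gzip vs. xz question on your own pages, here is a rough sketch using Python's standard `gzip` and `lzma` modules. The sample HTML and the `compressed_sizes` helper are made up for illustration; real archives will behave differently depending on page size and content, so treat the numbers as something to measure, not a given.

```python
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the byte size of data raw, gzip-compressed, and xz-compressed."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "xz": len(lzma.compress(data, format=lzma.FORMAT_XZ)),
    }


# Hypothetical, deliberately repetitive HTML: a table with 2000 similar rows.
rows = "".join(
    f'<tr class="item"><td>{i}</td><td><a href="/item?id={i}">link</a></td></tr>\n'
    for i in range(2000)
)
html = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

sizes = compressed_sizes(html)
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Both modules use their default settings here; xz is generally slower to compress than gzip, so the trade-off is ratio versus time.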