by dabmancer 3229 days ago
HTML is repetitive text, and that is what conventional compression algorithms (like the ones you mentioned) are good at. How they work is pretty interesting, so I'd encourage you to go look it up.
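
To make the "repetitive text" point concrete, here is a minimal sketch using Python's standard `zlib` module (DEFLATE). The sample markup is made up for illustration, and DEFLATE is only an assumption about which algorithms are meant; the exact sizes will vary with the input.

```python
import zlib

# Hypothetical sample: a table with many near-identical rows,
# which is the kind of repetition typical HTML is full of.
rows = "".join(
    f'<tr><td class="cell">item {i}</td></tr>\n' for i in range(200)
)
html = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

# DEFLATE replaces repeated substrings with short back-references,
# so each row after the first costs only a few bytes.
compressed = zlib.compress(html, 9)

ratio = len(compressed) / len(html)
print(f"{len(html)} bytes -> {len(compressed)} bytes ({ratio:.1%})")
```

Almost every row is encoded as a reference to the previous one plus the digits that changed, which is why the output is a small fraction of the input.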

The fact that you need to compress HTML makes me think there is a bigger problem, though: people shouldn't make such complicated web pages.

1 comment

I know how these work, and it is indeed interesting. For example, PAQ achieves great compression by training multiple neural networks, but it's still generic. So I thought maybe somebody has already pretrained something similar on a large amount of HTML, so that it compresses really well, even better than PAQ, which has to start from scratch every time...
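
A much simpler cousin of that pretraining idea is a preset dictionary: the compressor and decompressor both start out already "knowing" common HTML fragments instead of learning them from scratch on every page. Below is a rough sketch with Python's standard `zlib`, which accepts a `zdict` argument. This is not PAQ and not a neural model; `HTML_DICT`, the helper names, and the sample page are hypothetical, and a real dictionary would be built from a large HTML corpus rather than typed by hand.

```python
import zlib

# Hypothetical hand-written dictionary of boilerplate expected in most pages.
# An actual "pretrained" dictionary would be derived from a large corpus.
HTML_DICT = (
    b'<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
    b'<meta name="viewport" content="width=device-width, initial-scale=1">'
    b'<title></title><link rel="stylesheet" href="</head><body>'
    b'<div class="<span class="<a href="https://</a></span></div>'
    b'<p></p><ul><li></li></ul><script src="</script></body></html>'
)


def compress_with_dict(data: bytes, zdict: bytes = HTML_DICT) -> bytes:
    """Compress data with DEFLATE, primed with a preset dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(data) + c.flush()


def decompress_with_dict(blob: bytes, zdict: bytes = HTML_DICT) -> bytes:
    """Decompress; the same dictionary must be supplied on this side too."""
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()


# Hypothetical small page: small inputs benefit most from a dictionary.
page = (
    b'<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
    b'<title>Hi</title></head><body><div class="post"><p>Hello</p>'
    b'<a href="https://example.com">link</a></div></body></html>'
)

plain = zlib.compress(page, 9)
primed = compress_with_dict(page)
print(f"raw={len(page)} plain={len(plain)} with_dict={len(primed)}")
```

The catch is that both ends have to ship the same dictionary. As far as I know, Brotli takes this route with a built-in static dictionary that includes common web content, and zstd can train a dictionary from sample files.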