by CamperBob2 4504 days ago
Over time, I've learned to wget every web page and content archive I want to keep. The Internet forgets.

In an earlier age, I ran everything through squid to consolidate browser caches. About five minutes after setting it up, I realised that pulling all the references in the log file and then indexing the lot with htdig would be tremendously useful when I was on the road without internet access.
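For anyone curious what "pulling all the references in the log file" might look like, here is a rough sketch. It assumes squid's default native access.log layout, where the request method is the sixth whitespace-separated field and the URL is the seventh; a custom logformat would need different indices. The function name `extract_urls` and the sample log lines are made up for illustration, and the resulting list would still have to be handed to an indexer such as htdig separately.

```python
def extract_urls(log_lines):
    """Return unique GET URLs from squid access.log lines, in first-seen order.

    Assumes the native log format:
    time elapsed client action/code size method URL ident hierarchy type
    """
    seen = set()
    urls = []
    for line in log_lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        method, url = fields[5], fields[6]
        if method != "GET" or url in seen:
            continue  # ignore CONNECT/POST etc. and repeats
        seen.add(url)
        urls.append(url)
    return urls


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real log.
    sample = [
        "1066036250.129 345 192.168.0.2 TCP_MISS/200 1024 GET http://example.com/a - DIRECT/192.0.2.1 text/html",
        "1066036251.500 20 192.168.0.2 TCP_HIT/200 1024 GET http://example.com/a - NONE/- text/html",
        "1066036252.000 90 192.168.0.2 TCP_MISS/200 0 CONNECT example.com:443 - DIRECT/192.0.2.1 -",
    ]
    for url in extract_urls(sample):
        print(url)
```

On a real machine you would pass it an open file handle on the access log instead of the sample list, since iterating a file yields lines the same way.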

I spent way too much time pruning stupid crap such as slashdot and started to learn this 'Bayesian classifier' thing.

Your idea is much better.

That's personal use; I have no problem with that. The above project sounds commercial in nature.
That seems pretty presumptuous...