by labrador 1001 days ago
I've used HTTrack before. It's handy for downloading entire websites. The resulting pile of HTML can be converted to text or PDF later for processing.

https://www.httrack.com/
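
For the "convert to text later" step, here is a rough sketch using only the Python standard library. It is not part of HTTrack; the names `TextExtractor`, `html_to_text`, and `convert_mirror` are made up for illustration, and it assumes the mirrored pages are UTF-8 `.html` files under one directory.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the text chunks of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def convert_mirror(root: str) -> int:
    """Write a .txt file next to every .html file under root (hypothetical layout)."""
    count = 0
    for page in Path(root).rglob("*.html"):
        text = html_to_text(page.read_text(encoding="utf-8", errors="replace"))
        page.with_suffix(".txt").write_text(text, encoding="utf-8")
        count += 1
    return count
```

Calling `convert_mirror("path/to/mirror")` on the download folder would leave a `.txt` beside each page. It drops all layout, so anything that needs structure (tables, headings) would want a proper HTML library instead.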

2 comments

HTTrack - a name I hadn't heard in maybe 15 years. I used it back then. To clarify, though, what I set out to build isn't a full-site backup tool; it saves a site's URL and some metadata, with a local-first approach, and it's self-hostable.
How do things like these cope with client-side rendered websites?

Man, do I miss the days when you could “save as webpage, complete” and generally get a working copy. I saved webpages a lot because, after all, you might not always be online in those days!