by matt_morgan
4085 days ago
Do you know what I found out over the last few days? There's no simple tool you can use to download the actual content of your website, you know, for migrating it to a new CMS or whatever. Unbelievable. Something that will just crawl the site the way `wget -r` does, run a text extraction on every page, and save it all. Boilerpipe does the extraction nicely, but nobody has turned it into a simple tool. You just have to have a job and try to get stuff done for a while and this kind of thing comes up. Just wait and watch.

If you're talking about the source for dynamic pages, you can use any file copier like rsync. But if you're just talking about downloading a mirror image of a web site, httrack is your go-to.
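
For what it's worth, the "crawl, then extract text" idea from the first comment can be roughed out in a few lines. This is only a sketch under some assumptions: the site has already been mirrored to a local directory (for example with `wget -r` or httrack), and the extraction is a crude stand-in built on Python's standard `html.parser`, not Boilerpipe. It drops `<script>` and `<style>` content but does nothing clever about navigation menus or footers. The names `VisibleTextParser`, `extract_text`, and `extract_mirror` are made up for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path


class VisibleTextParser(HTMLParser):
    """Collects text that sits outside <script>, <style>, and <noscript>."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside the skipped tags.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def extract_text(html):
    """Return the visible text of one HTML document, one chunk per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()


def extract_mirror(mirror_dir, out_dir):
    """Write a .txt file for every .html/.htm file under mirror_dir.

    The directory layout of the mirror is preserved under out_dir.
    Returns the list of files written.
    """
    mirror_dir, out_dir = Path(mirror_dir), Path(out_dir)
    written = []
    for page in sorted(mirror_dir.rglob("*")):
        if not page.is_file() or page.suffix.lower() not in {".html", ".htm"}:
            continue
        target = out_dir / page.relative_to(mirror_dir).with_suffix(".txt")
        target.parent.mkdir(parents=True, exist_ok=True)
        html = page.read_text(encoding="utf-8", errors="replace")
        target.write_text(extract_text(html), encoding="utf-8")
        written.append(target)
    return written
```

If the mirror ended up in a directory called `example.com/` (a placeholder name), then `extract_mirror("example.com", "example.com-text")` would leave one text file per page. Swapping `extract_text` for a real boilerplate remover is where something like Boilerpipe would come in.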