by Cameron_D
4602 days ago
You can probably make your service do both screenshots and WARC. Instead of loading a site directly, load it through WARC Proxy (https://github.com/odie5533/WarcProxy); that will write out a WARC file, and you can still store your screenshot. Once you have the WARCs you can upload them to Archive.org, where they can be added to the Wayback Machine, or you can set up your own service for browsing them, built off something like warc-proxy (https://github.com/alard/warc-proxy; yeah, same name, different purpose). There is also a MITM version of WarcProxy that will let you store HTTPS sites: https://github.com/odie5533/WarcMITMProxy
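
For illustration, here is a rough Python sketch of what "load it through a proxy" could look like on the client side. It assumes a recording proxy is already running locally; the address `http://127.0.0.1:8000` and the helper names `make_proxy_opener` and `fetch_through_proxy` are placeholders of mine, not anything taken from WarcProxy's documentation, so check how the proxy you use is actually configured.

```python
import urllib.request


def make_proxy_opener(proxy_url="http://127.0.0.1:8000"):
    """Build a urllib opener that sends plain-HTTP requests via a proxy.

    proxy_url is an assumed address for a locally running recording
    proxy; adjust it to wherever your proxy actually listens.
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_through_proxy(url, opener, timeout=30):
    """Fetch url through the opener, so the proxy sees (and can record) the traffic."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

A screenshot tool would be pointed at the same proxy address in its own settings, so the proxy can record the traffic while the browser renders the page.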
http://www.archiveteam.org/index.php?title=Wget_with_WARC_ou...
This makes creating a browsable mirror of a site in WARC format fairly straightforward, as wget will automatically make links relative and fetch the requisite files (CSS, JS, images) for each page.
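
As a rough sketch of the kind of invocation this describes, the Python snippet below assembles a wget command line. The choice of flags is my guess at what is meant: `--warc-file` (available in wget 1.14 and later) writes the WARC, `--convert-links` rewrites links for local browsing, and `--page-requisites` pulls in the CSS, JS and images. The function names and the example URL are hypothetical.

```python
import subprocess


def build_wget_warc_command(url, warc_name):
    """Return a wget argument list that mirrors url and also writes a WARC.

    warc_name is the output base name; wget appends the .warc.gz suffix.
    """
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--page-requisites",   # also fetch CSS, JS and images each page needs
        "--convert-links",     # rewrite links so the local copy is browsable
        f"--warc-file={warc_name}",
        url,
    ]


def run_wget_warc(url, warc_name):
    """Run the command; requires a wget build with WARC support on PATH."""
    return subprocess.run(build_wget_warc_command(url, warc_name), check=True)
```

Calling `run_wget_warc("http://example.com/", "example")` would then leave both a browsable mirror directory and a WARC file next to it.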