| I can accept the other issues, but archivebox needs to be private and secure by default. Sending everything to archive.org is a bad default and it erodes a certain level of trust in the project. Requiring "several important changes and security considerations" just makes it a non-starter. The default settings should be "safe" for the default user, because as you mentioned in that post, 90% of users are never going to change them. Users should be able to run it locally and archive data without worrying about security issues, unless you only want experts to be able to use your software. There is also a contradiction between your statement and your blogpost: someone saving their photos isn't going to want to worry about whether they configured your tool correctly, or whether they are leaking all the group logs or grandma's photos. >It's prominently mentioned many times (at least 4) on our homepage that this is the default, and archiving public-only sites (which are already fair game for Archive.org) is a default for good reason. Archiving private content requires several important changes and security considerations. More context > Who cares about saving stuff? > All of us have content that we care about, that we want to see preserved, but privately: > families might want to preserve their photo albums off Facebook, Flickr, Instagram > individuals might want to save their bookmarks, social feeds, or chats from Signal/Discord > companies might want to save their internal documents, old sites, competitor analyses, etc. I want the project to do well but it really needs to be secure by default. |
I 100% agree, but private archiving, while doable, is NOT 100% safe yet, so I can't make that mode the default. The difficult reality right now is that archiving anything non-public is not simple to make safe.
Every capture will contain reflected session cookies, usernames, PII, and other sensitive content. People don't understand that this means that if they share a snapshot of a single page, they're potentially leaking their login credentials for an entire site.
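To make the risk concrete, here is a rough sketch of the kind of check you could run on a snapshot before sharing it. It is not ArchiveBox code: the function name `find_leaked_secrets`, the `min_len` threshold, and the sample values are all made up for illustration, and a plain substring match will miss encoded or hashed copies of a secret.

```python
def find_leaked_secrets(snapshot_text: str, secrets: dict[str, str], min_len: int = 8) -> list[str]:
    """Return the names of secrets whose values appear verbatim in a snapshot.

    `secrets` maps a label (e.g. a cookie name) to the sensitive value.
    Values shorter than `min_len` are skipped to avoid noisy false positives.
    """
    leaked = []
    for name, value in secrets.items():
        if len(value) >= min_len and value in snapshot_text:
            leaked.append(name)
    return sorted(leaked)


# Hypothetical example: a page that reflects the session token and username.
html = '<script>window.session = "abc123def456";</script><p>Hello alice_example</p>'
secrets = {
    "sessionid": "abc123def456",
    "username": "alice_example",
    "csrftoken": "zzzzzzzzzzzz",  # not present in the page
}
print(find_leaked_secrets(html, secrets))
```

A check like this only catches exact copies of values you already know about, which is part of why private archiving still needs user education rather than a one-click default.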
It is possible to do safely, and we provide ways to achieve that, which I'm constantly working on improving. But until it's easy, straightforward, and doesn't require any user education on the security implications, I can't make it the default.
The goal is to get it to the point where it CAN be the default, but I'm still at least 6 months away from that. Check out the archivebox/sessions dir in the source code for a look at the development happening here.
Until then, private archiving requires some user education, plus setting up a dedicated Chrome profile, cookies, and some config tweaks. That setup is an intentional barrier to entry for private archiving.