Could be! However, making a direct attack on individual privacy should never have been an option. To make matters worse, the logic of, "We did this to government and military websites, so now we're going to roll it out everywhere" was quite broken for the time and remains so.
There are examples of how this works in a healthy way. Martin Manley is one that comes to mind: he overtly opted in to having an archive stored about him upon his death: https://martin-manley.eprci.com/
Neither 'flaunting privacy' nor 'direct attack on individual privacy' is a fair description of any of the Archive's web collection policies.
People who freely publish information, to the worldwide public, on the 'World Wide Web' should reasonably expect all sorts of entities to collect, save, analyze, & repurpose that info, unless they take specific steps to discourage such access & use.
The Archive's crawlers identify themselves, and collect things that are publicly linked, or specifically nominated-for-collection by library patrons or partners. Except in some focused specialized collection projects, they don't "log in" as any user, only visiting & collecting what's published freely to any anonymous person/organization/process.
For material needing more privacy, websites always have the option to block any and all unwanted visitors/crawlers with a wide variety of standard techniques, like requiring logins or simple challenges that automated crawlers won't pass.
And, as your linked articles report, the process for a later exclusion by request is pretty quick and simple. (The 2nd post concludes: "So, hats off to the Internet Archive for making the process smooth and relatively painless.") And, such exclusion does not require any sort of "DMCA request".
This is victim blaming. In my jurisdiction, you retain copyright in any information you publish, even to the worldwide public. This means I can reasonably expect entities to collect, save, analyze and repurpose that info within reason, and without specific steps to discourage access & use. This is why there are laws such as 'fair use' and 'satire', because we wanted to extend what is considered reasonable use of public works. But redistributing copyrighted works without permission? Legally actionable, if you have the money and lawyers and access to the necessary courts. If this were software, such as free software license violations, people in this forum would be calling for the lawyers to nuke them from orbit.
Thankfully DMCA should make the removal process easier now, especially in situations where control over the domain has been lost or the content is hosted by a third party. Although last I saw there were still artificial barriers, such as needing to list every single individual page needing to be taken down. But this is after the fact, after you have discovered that your reasonable expectations and privacy have been violated. And then you have to track down the other copies of your now-private and copyrighted information that IA illegally distributed, such as to a few libraries around the world with similar projects.
I'm talking about the unfair allegation of privacy violations, here.
Note that when the Archive shares crawled content with other libraries, those other libraries often have their own legal right to collect, preserve, and make available that data, a right even stronger than the Archive's rights via fair use, implied license, library privileges, and other grounds. For example, many of the Archive's partners in government libraries, archives, & educational institutions have a statutory right & mission to collect copies of everything 'published', including via the world-wide-web, in their sphere of national interest.
As to what some unstated jurisdiction might consider "within reason", I prefer to think they'll find reasonable what I find reasonable (the IA's crawling policies) unless & until some actual governing authority finds otherwise in a clearly applicable/legible decision.
See my root post (ggggggp): in a vital, evolutionary, true-law-made-on-the-ground civilization, what actually winds up as "within reason" depends on the real implementations & multi-decade demonstrations of how things can beneficially work, as much or more than any copyright loyalist's strict reading of older statutory laws.
Crawling and archiving everything, including personal writings, has a chilling effect. It is the same situation people are seeing with social media, where the past remains to haunt the present and none of our future leaders are using it without a mask. It was most surprising to people when some libraries decided 'published' meant anything put on the WWW or posted to Usenet. It seemed a grasp for funding and an attempt to stay relevant in an age where information was moving out of published media and into opinions virtually scrawled on a toilet door. The stuff I needed to get removed from the Australian National Library's archive is exactly the sort of stuff that shouldn't be in there, directly against the statutory rights and mission, and the sort of thing that could be pointed to when you wanted to defund the project. Because some twit thought meaningful Australian published materials meant anything under a .au top-level domain, in went all the dross hoovered up by IA, including all the stuff since removed because it is in nobody's interest or was causing harm. And it was a pain in the arse.
There is an overlap between the two. Copyright can be used as a defense against folk who believe, "Everything on the internet not behind authentication is commons". Often these folks point to books, magazines, etc. in support of their argument, which is certainly bad faith, but that's why copyright arguments come up.
Copyright is a mechanism used to protect privacy in these situations. When you don't have copyright, you are stuck needing a court to protect your privacy. Copyright is also what you are required to prove in order to get stuff taken down by IA when the content is not obviously illegal or personally identifiable information (or at least it was when I last needed to deal with it).
Information on a public website is public until it is taken down or the information is changed. The Internet Archive removes an individual's control over when the information remains public. This is privacy. We might be caught naked, and we can't unsee what has been seen, but it is a basic human instinct to draw the curtains and contain further damage. Perfectly innocent individuals suffer because the IA rules are designed around edge cases where public figures try to hide misdeeds.
If you print a magazine you also don't get to recall all copies if you change your mind about something. Giving individuals this kind of control over others' ability to freely share information is dangerous because it is easily abused to hide information that is in the public's interest, and that is not an edge case at all. Making a decision to publish something on the public web is hardly analogous to being caught naked, even if you may come to regret either.
If anything, the IA should be more reluctant to remove information without a court decision.
> The Internet Archive removes an individual's control over when the information remains public.
And that's a good thing in the vast majority of cases. Unless we're talking about sensitive information that was published without the consent of the person in question, all public information should remain public forever.
In my experience, it is the vast minority of cases. Most of the content of the IA is not in the public interest, now or in the future. It is crap. It is noise. It is the contents of the Internet at a point in time. Actual information is the wheat in the chaff, and why you need search engines to find it. We know this, because of the Usenet archives that are intermittently available. Almost completely useless apart from people having a giggle at how the Internet used to be, a quick browse and search for naughty words. And a few gems in the mountain of noise, in such dire need of curation people hardly know it exists and barely justifiable enough for libraries to keep it alive.
Some people discover much too late that there are some things they wish they could take back. Often before trying to get a better job or when trying to escape an abuser. Given the ramping up of attacks (legal and otherwise) on queer people, this is going to be a huge issue over the next decade or so.
If you’re relying on an honor-system .txt file to preserve your privacy, I think that says enough already. It’s not like they’re infiltrating password-protected links or private iCloud accounts.
In truth, regulation can only reliably protect your privacy from well-behaved actors whose actions/violations are observable.
If you've taken no self-help measures to limit access, then bad actors, unobservable to you and regulators, will still be doing whatever they would like to do and can get away with.
But you may be lulled into a false sense of security by the false promise of a 'solution' via regulations.
As I've now learned, you used to work for the Internet Archive. You should probably start your statements with that.
> If you've taken no self-help measures to limit access...
robots.txt was a nice self-help measure.
> ... then bad actors, unobservable to you and regulators, will still be doing whatever they would like to do and can get away with.
Regulators still have to follow regulations. You are right that I can't stop someone from creating offline archives - but they're not really who I am worried about. Nor am I worried about the small servers that keep copies of documents during transmission, unless of course they're doing so for criminal reasons.
Should you start all of your statements with a list of every project you've ever worked on? Show me an example of how it's done before you make such an exceptional request of me.
For any who are more curious about a commenter's background than their current words, my profile already links to copious resources on my work history & writings, beyond what's typical of contributors here.
Yes, if I were commenting on a project or company in which I had a self-interest (e.g., reputational, monetary, etc.), then I would add a disclaimer like everyone else here does.