Hacker News
by zafiro17 1956 days ago
On behalf of Usenet users everywhere, this is excellent news, and I wouldn't be surprised if Usenetters begin stuffing other useful newsgroups with crap in order to get them removed from Google as well.

Google's assimilation of Usenet content had promise at first but quickly turned into a dystopia and the general consensus on Usenet is that Google has been a disaster for Usenet.

2 comments

Google had the most complete archive of comp.lang.c (and the rest of Usenet). That archive is now inaccessible.

Users posting to comp.lang.c through Google Groups were a problem -- especially with the recent bug that caused GG posts to comp.lang.c++ to have the "++" quietly dropped.

If Google made its archive of all Usenet newsgroups available, I'd be fine with them dropping the posting interface. (And some users actually managed to post to comp.lang.c through Google Groups without breaking things.)

> "... complete archive of ... Usenet"

<sotto voce> Usenet postings were never meant to be permanent. I was there in its heyday and nobody expected their postings to live beyond the spool expiration lifetime.

Yup. I've met a number of old farts like me who were not thrilled when what we assumed was transient had been archived. I'm not a big fan of the opt-out model of archiving where archivers just assume that they have implied consent to grab anything and everything from everyone who hasn't explicitly said no.
The problem is that opt-in archival is effectively no archival.

If you looked at how data is lost online, you'd find out that by far the most obvious and immediate cause of data loss was third-party intermediaries shutting down old sites. For example, when Yahoo! decided to burn down Geocities - and shittons of early Internet history - on the basis of saving some hosting and storage costs. A second cause is neglect; say, you move your blog to a new platform, but you don't bother redirecting the old URLs, so now you've just carved a hundred or so new dead links into other people's content. In both of those cases, data wasn't being deliberately removed because it needed to go - it just happened by accident.

Furthermore, we don't usually know when these accidents happen. Yes, occasionally some intermediary announces it in advance, but oftentimes linked sites just die. This is a phenomenon known as link rot, and it can happen for a whole host of accidental reasons. Try going to a 10-year-old news article and clicking on any of the links, or going to a decades-old forum thread and looking at any of the images. Count how many of them still work. It's depressingly low.

Now, try to imagine getting consent to archive in advance from people who do not know or care about the problems I've just mentioned. You won't get very far - not because the archiving is harmful to them but because most people do not understand the problem. People don't back up their own shit until after they've already lost heaps of data. Even in situations like the Geocities shutdown, the logistics of actually asking for permission to archive on top of running a bunch of scrapers to actually do the archival is... complicated. You could do it, but the vast majority of data would go unarchived purely due to an inability to locate the owner of the data.

Dunno. History matters and historians would probably strongly disagree. Some things are more important than you think.

Too much gets lost too easily.

Not everything is deserving of being treated as historical artifacts worthy of retention. A friend has lamented that when he was in college in the late 80s, he used USENET as a way to connect with people and work through some pretty heavy emotional challenges. Retaining his personal struggles is hardly "history" - if anything, the historical value of retaining the struggles of one inconsequential person is far lower than the direct impact a google search has on his life today. So as you said, some things aren't as important as you think - in his case, privacy and respect are a bit more important than a historical record of a teenager seeking people to talk to. Every minute detail of history isn't as important as you think.
You'd be surprised by what is considered history. Nowadays, insights into the lives and emotions of ordinary people are considered an invaluable tool in recreating the past: the context in which political, economic, and cultural developments happened. In fact, the spontaneous nature of certain artifacts makes them more valuable, because what we would call "official history" is always editorialized and subject to the influence of only a few people and not produced by the entire society it originates from.

It is something that I've thought about - the contradiction between privacy and the need to communicate yourself to the generations to come and the broadcast into the void it requires. If your friend is okay with his privacy in the archives, he might be glad to know that in a hundred years, there will be an AI whose PhD will be on the emotional significance of new technologies in the lives of early adopters of the Internet, the case of user John Smith 1988.

Once you publicly post something, or even post something privately to someone else, it’s not entirely yours anymore, in an important sense it’s theirs.

Yes, Usenet used to have a spool lifetime, but that was clearly variable and there were no guarantees about it. Anybody could set up a mirror and frequently did. Also anyone could copy out content to other media and there was no expectation that they couldn’t do that.

We are all responsible for what we post to other people and in some cases also the effect it has on them. Consider bullying, abuse, criminal conspiracy: the record of receiving a message belongs to the receiver, not the sender.

Every single minute detail is exactly equally important, because neither you, nor I, nor anyone has any idea what will be important or why it will be important.
To a sociologist

> connect with people and work through some pretty heavy emotional challenges

is a goldmine and insight into history.

I think, judging what is going to be of value for people in the future is a much harder problem than you make it out to be.
And yet Eric Schmidt says (2010)* we should change our name... Because everything is always kept

https://m.huffingtonpost.com.au/entry/google-ceo-eric-schmid...

* It's ironic I'm referring to a 10yr old piece to refer to the danger of continuous archives.

In 2001, it was Scott McNealy:

https://nerocam.com/DrFun/Dave/Dr-Fun/df200108/df20010808.jp...

If you are not familiar with Dr. Fun: Don Knuth Finally Sells Out https://nerocam.com/DrFun/Dave/Dr-Fun/df200002/df20000210.jp...

Not really, that information gradually gets bit-rot and evaporates away to nothingness as website after website gets old and vanishes.

In the early days of the Internet, I often used to Google/AltaVista/Jeeves my own name. There used to be quite a few hits. Over the years, those old hits have just faded away.

>Dunno. History matters and historians would probably strongly disagree.

People don't live or structure their lives to satisfy historians...

People also don't live or structure their lives to satisfy those who had a written conversation and want the copies burned.
No. I'm struggling to think of a reply better than "Obviously, not."

I think your statement was already implicit in what I was saying. We should move on to the next bit where I say "But..." and you counter it.

Even if this were true, it would be better if the archives were not available for some decades after the original posting. I’m OK with my content being analyzed by historians, but the definition of “historian” presumes some professional detachment that is not available short years after the post was made.
I think even leaving aside historians - the idea of making everything posted transient by default would have robbed us of a lot of stored knowledge. I'm feeling quite bitter about the rise of Slack and Discord over mailing lists and Usenet.

The ease with which one can find clues and solutions to decades-old technical questions relies on everything being stored.

And more than just technical issues - I'm no historian but I've wasted endless hours following fascinating discussions from the past that suddenly become relevant because of recent events or unforeseen connections. I love how much is preserved by accident. It makes me slightly sad to think that it might be otherwise and that others would wish it otherwise.

> but the definition of “historian” presumes some professional detachment

The idea of everything you've ever posted becoming part of a giant "digital permanent record" used by data brokers, advertisers, credit bureaus, trolls, nosy people, etc. is somewhat unappealing.

It has to be available for some period of time after posting. Where would you draw the line? Hide them after a few weeks? Months? Years?
Before it becomes history:

Phase 1: hall monitor trolls in coordination with HR use it to keep people in line

Phase 2: the archive is rediscovered as truth about what the world used to be like and suppressed

Phase 3: there is no archive and never was

Phase 4: the archive is rediscovered, hidden and the esoteric knowledge used to start a cult

Phase 5: the cult eventually prevails in society, becomes a religion and suppresses inconvenient parts of the archive

Phase 6: well history repeats itself so why go on?

It's not much different than a recording of a conversation. I wouldn't implicitly expect the recording to be destroyed, or expect the law to require destruction when the recording was made on equipment I don't control.
I don't think that these are equivalent. This issue is similar to saying that every conversation will be recorded by default, unless you opt out of a recording beforehand.
This would be why pretty much every Usenet post I made from 1996 (or so) onwards had an X-No-Archive header. How many actually honoured it, I cannot say, however.
Yeah, Dejanews was the harbinger of the archiving apocalypse. I added X-No-Archive: Yes to have dejanews throw away my posts the instant I learned about it, and then later on when Google bought dejanews and gave us all a one-time "opt out or be archived forever" chance, I was able to purge all the rest.
When do you consider that heyday to be? I showed up in 1995 and my expectation was always that anything posted there would be permanently on the internet.

So I deployed Kester’s Rule of the Internet: Don’t put things on the internet that you don’t want to be on the internet. That way there won’t be anything you put on the internet that you didn’t want to be there.

It took years, but all of my old Usenet posts were eventually removed. The people running archives tended to be real jerks about it - I’m pretty sure a couple of them got off on telling me no with long hand-typed explanations. In the end, though, most of them were running archives on their employer’s equipment, and when they switched jobs, got let go, or retired, they couldn’t take the archives with them. Nobody else wanted to maintain them, so eventually the last copy I could find fell.
The jerk was the person who published something publicly and voluntarily, then perversely tried to make anyone who had saved a copy of the thing they were given delete it.

I'm amazed anyone gave you the time of day. I can't imagine I'd have even answered. I certainly never signed any kind of copyright agreement when I started posting on BBSes or newsgroups.

Somewhere on one or more old hard drives I may have a few years of whatever groups I was into at the time, and I tell you now, on this new similar public forum, I will not bother digging out and scanning all my old drives to find anything to remove it, and I will not promise never to add them to any public archive.

Different situation but same overall attitude. I'd send a polite one-line request asking to remove a specific message and I'd get back three or four hand-typed pages where the person was gloating that I was powerless to make them do anything, basically the same thing I see revenge porn victims going through - though nobody ever tried to charge me to remove a post and nobody went the route of re-posting my messages to spite me.
Maybe so, but those of us who try to restore or emulate old stuff from those long ago times actually have to rely on contemporary information from Usenet because the old documentation no longer exists.
> nobody expected their postings to live beyond the spool expiration lifetime.

I know that a number of posters in groups I used to frequent added the X-No-Archive: yes header to their posts in the early 2000s. Google, and the Deja News archive before it, apparently did honor that header.
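
For anyone who never saw the header in the wild, here is a rough sketch of what honoring it could look like on the archiver's side. This is only an illustration using Python's standard `email` parser: the function name `is_archivable` and the sample article are made up for the example, and it is not the code Deja News or Google actually ran. As far as I know the header was a convention between posters and archivers, so the exact matching rule (a case-insensitive "yes") is an assumption too.

```python
from email import message_from_string


def is_archivable(raw_article: str) -> bool:
    """Hypothetical check: False when the article asks not to be archived."""
    msg = message_from_string(raw_article)
    # Header name lookup is case-insensitive; a missing header yields "".
    value = msg.get("X-No-Archive", "")
    return value.strip().lower() != "yes"


# Made-up article for illustration only.
sample = (
    "From: someone@example.invalid\n"
    "Newsgroups: comp.lang.c\n"
    "Subject: test\n"
    "X-No-Archive: Yes\n"
    "\n"
    "Body text.\n"
)

if __name__ == "__main__":
    print(is_archivable(sample))
```

Note that nothing in this enforces anything: the header is a request, and it only works if whoever keeps the spool chooses to run a check like this.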

This is hilarious because Google is probably the largest user of C++ and has the most people on the C++ standards committees.
This is about C, not C++. Though I haven’t been there in years, I chuckle to think what would have happened if you made this mistake on that newsgroup.
Pretty sure OP is referring to this part of the comment they were replying to:

> Users posting to comp.lang.c through Google Groups were a problem -- especially with the recent bug that caused GG posts to comp.lang.c++ to have the "++" quietly dropped.

I would have thought that that had already happened to every Usenet group by now, or even 20 years ago.