by stillkicking 2905 days ago
In many ways he's understating the problem. At least there used to be file formats to support, and files to lose. But how much of the data you interact with daily is even directly accessible to you? How much of it can you access when you're offline?

Take Slack. You can install it locally, but that doesn't matter, because it won't start without an internet connection, and its local storage is completely opaque. Compare this to a mail client, where everything is stored locally, indexed and searchable offline, and can be exported into a universal albeit messy MIME format, as well as imported into other accounts.

Of course this is necessary for the business model: the Slack free-plan event horizon wouldn't be effective if it only applied to which new data you can sync down... and if users only discovered this when, e.g., moving to a new computer, they would quickly start to wonder why they can't just transfer the data they already have themselves.

With Google Drive, your 'docs' are just placeholders pointing to the cloud. You can export them individually to Word or print them to PDF, but it's a manual process, which you will likely only think of when it's too late.

For an example of how this can really matter: an acquaintance is embroiled in a legal dispute with an ex-employer. Thanks to Apple's sane implementation of IMAP Mail on iOS, she still has access to all her company communications there to use as evidence. Unlike on her PC, where she was just using webmail the entire time and has nothing.

The incentive in the cloud age is to create dysfunctional products that provide an illusion of permanence instead of the reality of tangibility. I expect this is only going to become a bigger problem over time.

6 comments

These are all features for companies: no liability from the haunting written word.

I'm desperately trying to connect to "modern" chat services via things like Pidgin to have logs. While on some level remembering everything exactly the way it was is unnatural, it is important to be able to keep links and knowledge.

I think people are starting to realize that actually having a copy is important. Hosting things for yourself, taking care of backups, etc., are painful, hard problems, but they are worth it in the long run.

One more thing: much digital media is significantly more ephemeral than a lot of us realize. Out of countless CDs from the past 20 years, I can still read only a few (the ones written with a 1x Plextor SCSI drive are all still fine). If you truly value a photo, make a proper print: archival grade tint, archival grade paper.

> If you truly value a photo, make a proper print, archival grade tint, archival grade paper.

Or you could parity-pad your backups so even 20% corruption is still recoverable and periodically renew them. Several orders of magnitude cheaper.
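To make the parity idea concrete, here is a minimal sketch, assuming the simplest possible scheme: four data blocks plus one XOR parity block, so losing any one of the five blocks (20%) is recoverable as long as you know which block is gone. This is not what the comment's author necessarily uses. Real tools such as par2 rely on Reed-Solomon codes and handle much messier damage. The names `split_with_parity` and `recover` are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_with_parity(data: bytes, n_data: int = 4):
    """Split data into n_data zero-padded blocks and append one XOR parity block."""
    block_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(block_len * n_data, b"\0")
    blocks = [padded[i * block_len:(i + 1) * block_len] for i in range(n_data)]
    return blocks + [xor_blocks(blocks)]


def recover(blocks):
    """Rebuild a single missing block (marked None) from the surviving ones."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can rebuild only one missing block")
    blocks = list(blocks)
    if missing:
        present = [b for b in blocks if b is not None]
        blocks[missing[0]] = xor_blocks(present)
    return blocks


# Usage: lose one of five blocks, then rebuild the original payload.
payload = b"wedding photo bytes"
stored = split_with_parity(payload)
stored[2] = None  # simulate one unreadable block
restored = recover(stored)
original = b"".join(restored[:-1])[:len(payload)]  # drop parity and padding
```

Note that the original length has to be stored somewhere alongside the blocks so the zero padding can be stripped, and that "periodically renew them" still applies: parity only helps if you check the blocks before a second one rots.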

Then there is a house fire and your computers are destroyed and you're in a coma. Would your spouse know how to retrieve your wedding photo from your parity-padded offsite backups? Does anyone in your life know where they are, and the access information, and the encryption key/password, and how they're organized, and how to find specific content within them?

> You can export them individually to Word or print them to PDF

Actually it's really really easy to do in bulk: https://takeout.google.com/settings/takeout?pli=1

Not invalidating the rest of what you said.

I think the rest of what they said matters, though. It's not "preservable by default". If you need access to that data in any situation where your connectivity to Google or that account is severed, you're shit out of luck. The best case example is if you don't have internet. The bad case is really that "legal dispute" argument, where you're dealing with a bad actor who has power over you and your Google account. The worst case: Google themselves severs your access.

I'm trying to not phrase this in a dick way, but what did you think I meant by "Not invalidating the rest of what you said"?

Very true, but also similar to not having a good system of backups, which is hardly uncommon for many users (present company excluded I'm sure).

Google's data take-out has proven fairly robust.

> The incentive in the cloud age is to create dysfunctional products that provide an illusion of permanence instead of the reality of tangibility. I expect this is only going to become a bigger problem over time.

Isn't this what SaaS is all about?

Well, more specifically, it's one of the longstanding arguments against SaaS as a concept.

It's a tired tune to say that RMS was right, but he was. Letting people be controlled by software is not a good idea for pragmatic reasons that go beyond mere morality.

I think it is up to everyone to weigh the pros and cons of using SaaS. Many enterprises would not even exist if they had to build and maintain their stack themselves.

Also, I would not equate not controlling your software with being controlled by it. In some cases maybe, but definitely not all.

There are shades of grey in SaaS implementations, though. That the product is operated through a third-party server doesn't always imply it needs to be only available when you're on-line, and doesn't imply that your data needs to be taken hostage. A big part of the problem here is what 'stillkicking mentioned - doing away with files. Instead of data files, you have "documents" (meaning specific to a service). You can no longer open them independently, copy them to local storage, or e-mail them to a friend - you're restricted to "sharing" them within the platform. This is not a technical necessity. It's just driven by the business model.

This is why I believe in CRDTs: https://news.ycombinator.com/item?id=17221221

With the right spec, you can have Google Docs style real-time collaboration or Slack style chat while allowing users to own and archive their data, remaining resilient under arbitrary network conditions and topologies, and retaining full edit history with the option to link to any previous revisions or even individual changes (with context). The system could be built on top of databases, files, or both. Your "*.crdt" documents could live in your Dropbox and work seamlessly with any software that understands the spec.

It'll take some work to get there, though. And of course, it'll never be as resource-efficient as a centralized architecture. But today’s devices can more than handle the strain.
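For readers who haven't met CRDTs, here is a minimal sketch of the core property, assuming the simplest case: a chat log modeled as a grow-only set of messages, where merging two replicas is just set union. This is not the spec from the linked thread. `ChatLog` and `Message` are hypothetical names, and a real design would also need deletion, editing, and a file format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class Message:
    lamport: int  # logical clock value, gives a stable ordering
    author: str   # replica name breaks ties between concurrent posts
    text: str


@dataclass
class ChatLog:
    replica: str
    messages: set = field(default_factory=set)
    clock: int = 0

    def post(self, text: str) -> Message:
        """Add a local message; works with no network at all."""
        self.clock += 1
        msg = Message(self.clock, self.replica, text)
        self.messages.add(msg)
        return msg

    def merge(self, other: "ChatLog") -> None:
        """Union is commutative, associative and idempotent, so merge order never matters."""
        self.messages |= other.messages
        self.clock = max(self.clock, other.clock)

    def history(self) -> list:
        return sorted(self.messages)


# Usage: two replicas post while offline, then sync in either direction.
alice, bob = ChatLog("alice"), ChatLog("bob")
alice.post("hi")
bob.post("hello")
alice.merge(bob)
bob.merge(alice)
alice.post("synced")
bob.merge(alice)
texts = [m.text for m in bob.history()]
```

Because each replica holds the full set of messages, either side can archive its copy to a plain file at any point, which is the "own your data" part of the argument.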

> For an example of how this can really matter: an acquaintance is embroiled in a legal dispute with an ex-employer. Thanks to Apple's sane implementation of IMAP Mail on iOS, she still has access to all her company communications there to use as evidence. Unlike on her PC, where she was just using webmail the entire time and has nothing.

I agree this is an issue, but just thought it was worth pointing out that (cloud + offline) is better than just offline.

I'm a bit of a digital packrat. Despite carefully preserving and porting my data from system to system, I lost years of email archives in the aughts because Outlook Express would prompt me to archive old mail, but I didn't realize it was over-writing the old archive until I went looking for a message and couldn't find it (when it was too late). The lost email covered all but the last few months of my undergraduate degree.

Cloud + offline is best. Just offline is worse, but just cloud is the worst, and that's what we're being offered 99% of the time. E-mail is a special case because the protocol and traditions around it predate the modern commercialized Internet. But e.g. all the other communication services regular people use nowadays are cloud-only.

This is needless pandering and fear-mongering, akin to the hysteria that inevitably comes with the evolution of technology. For example, the "telephone will be the end of society" trope that was popular at the turn of the century.

Anyway.

> With Google Drive, your 'docs' are just placeholders pointing to the cloud. You can export them individually to Word or print them to PDF, but it's a manual process, which you will likely only think of when it's too late.

This has been trod to death over the years. Let's do some find-and-replace.

With a Hard Drive, your 'docs' are just placeholders pointing to the platter. You can copy them individually to diskette or print them to hardcopy, but it's a manual process, which you will likely only think of when it's too late.