Interesting in a technological sense, but what problem it solves isn’t obvious to me.
It lets me granularly authorize first-party access to the data I have in my pod, but there can’t be any technical guarantees with regard to illegitimate sharing or other copying (many might at least cache, for instance) – nor about what is collected and shared outside this system.
I keep seeing data-hubs and identity-providers touting themselves as solutions to the web's privacy issues, but I don't see how they actually solve anything.
It seems like an attempted technical solution to a social problem to me.
The real problem with data based services (ads, Google search, etc) is really that a bunch of data is collected opaquely, unethically, and in some cases illegally. The whole system including data brokers and real time bidding is out of control.
It's partially related to the webmail-vs-IMAP problem: direct access and control over your data.
An example: There is a pending suit that will ultimately be settled with an insurance company by the courts. Crucial to the case is data collected by a mobile app that helps establish some relevant facts. The incident in question was >1 year ago, and we're going to move forward with the case this week (originally planned for last spring but put off due to COVID). Yesterday, I logged in to the site associated with the app, and it threw up a screen that cannot be dismissed, in the style of "please take care of <these issues with your account> before you can proceed". This is an account which is nowadays dormant, and there is in fact no way to take care of these issues. I dug out my old phone in an attempt to access the records in-app and take screenshots for the benefit of the court. The app itself had had an update released, and the records are now inaccessible, because the old version of the app is treated as an obsolete client. Fortunately, I'd already earlier exported all the data I could readily get my hands on—so the only thing I'm giving up are those screenshots that I determined in a last-minute decision would be helpful as supplemental resources—but this could have been a problem for someone who's never heard the phrase "move fast and break things" and who took it on faith that all this stuff wouldn't just disappear underneath their nose for seemingly no good reason.
If we transition to a world where apps are always writing to (and pulling from) data stores that are under your control, then this would be a total non-issue, even for people less paranoid/guarded than I was. The truth is that there are social hurdles, but there are technological hurdles, too, and dealing with the technological part is a precondition to society being able to be effective in doing its part. People can't solve problems with solutions that don't exist.
In my (personal) view, it's the technical part of a solution that definitely also needs to have a social/legislative component. It cannot prevent parties from illegitimate sharing of my data, but it does give them the option to hand over control to me. There are lots of companies that currently hold data on us but for whom that data is not their primary competency, and they only need a small nudge (like GDPR) to make having the customer responsible for that data an attractive proposition.
You might be right, but I think it's disingenuous to market it as though this "solves privacy". Worst case, people are lulled into a false sense of security.
Data-storage + authorization doesn't solve any (new) technical privacy-issues; this is "data protection" rather than "data privacy" in my book.
While I recognize the value of W3C LDP and SOLID, I also fail to see anything in SOLID that prevents B from sharing A's now pod-siloed information.
Does it prevent screenshots and OCR?
So it's in standard record structs and that makes it harder for the bad guys?
Who moderates mean memes with my face on them?
It is my hope that future Linked Data spec tutorials model something benign like shapes or cells instead of people: so that we can still see the value.
Laws still exist against things like perjury, even though the existence of the law is not a technical means in itself able to prevent perjury. Note that one of the comments upthread specifically mentioned legislation. The current notion that many people in the tech world have, which roughly states that what determines whether something is kosher is whether it's technically possible to accomplish, is something that needs to change, instead of things just staying a permanent Wild West forever.
There's also an old phrase that putting locks on your doors doesn't actually stop a determined attacker, but that it's okay because they're not meant to—that they're meant to "keep honest people honest". It's a principle that applies here.
No, there are few to no actual privacy improvements over centralized systems.
Perhaps even a functional regression: what, are you going to run a hash blocklist across all nodes? Like Spamhaus? Is there logging or user accounting? Is anything chain-of-custody admissible, or are we actually talking about privacy and liberty here?
Is everything just marked "not for unlimited distribution"? And we depend upon there not being bad actors?
Real costs are very different with just friendly early adopters.
Cryptographically signing posts (with LD-Signatures) may help with integrity, but that can be done with centralized systems and does nothing to help with confidentiality.
What about availability?
Is it a trivially-DOS'able system?
Maybe we need a quantum leap in technology first? Operating on data that can't be immutably copied (i.e. quantum state) opens up interesting possibilities in privacy space.
Again, the issue is that once you have shared (= sent) the data with anyone, even once, you don't have any technical control over what might happen to it (see The Pirate Bay as an example).
Quantum anti-tampering isn't going to help here, where it's your interlocutor that is the one that can't be trusted.
Depends on what the data is and if the aggregators can understand it.
You could have the model where the silo produces encrypted blobs and the end client can read it. (What's stored and connected is nothing but encrypted blobs)
> “This technology could unlock an enormous amount of innovation,” potentially becoming a new platform as the iPhone was for smartphone apps, he said.
"Platforms", aka "Minitels 2.0", are what is wrong today with the Web specifically, and with the state of today's infocom technologies in general.
The whole point of Tim's "pods", is that just like the WWW, they aren't going to be just another private, centralized platform. Or has this word diffused to the point of losing all meaning?
> The whole point of Tim's "pods", is that just like the WWW, they aren't going to be just another private, centralized platform.
Difference is that pods operate on top of open protocols for storing and accessing data. This means that you can stop hosting your pod and move data to another hoster of pods.
TBL and W3C could enable competition to de facto monopolies such as FB by providing web standards that enable competitors to overcome FB's inherent walled garden first mover silo advantage.
How?
Extend HTML to include a Like button and a Share button, and implement a new standard that defines an open access comment platform.
I'm not suggesting W3C should set up servers to compete with service providers. Rather, it could define protocols for those capabilities as web standards which are designed to enable arbitrary 3rd party implementers to federate interactions. That way, service providers could attract niche social groups, whilst pooling interactions, thereby overcoming the dilemma of all being too small to compete with FB.
And ActivityPub is already a W3C published standard... Of course, having an existing standard doesn't mean that the Facebooks's of the world will choose to adopt it.
They don't really have to add anything to HTML. I just put it that way to express the idea of the "Like" button being a page element available anywhere on the web, rather than only inside a walled garden.
What is really required is a database protocol for tracking Likes, or "a client/server API for creating, updating, and deleting content, as well as a federated server-to-server API for delivering notifications and content."
But they have that! I didn't know about ActivityPub until I read mxuribe's comment above. That's a good start.
As mxuribe says, "having an existing standard doesn't mean that the Facebooks's of the world will choose to adopt it." I would expect FB to resist it. But if the backlash and dissatisfaction with FB grows, a protocol like ActivityPub is a necessary enabler for something new to happen. By allowing multiple providers to share content in a federated model, the protocol could grow organically without requiring one big new player to migrate all the FB users to a new monopoly.
Once it starts to happen, FB customers could be bridged into the new federated universe with translators that mirror content from FB into the new ecosystem.
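To make the "database protocol for tracking Likes" idea a bit more concrete, here is a minimal sketch of what a federated "Like" could look like as data. It assumes the ActivityStreams 2.0 vocabulary that ActivityPub builds on (a JSON object with `type`, `actor`, and `object`); the helper names `make_like` and `count_likes` and all the example.org-style URLs are made up for illustration, and nothing here covers delivery, signatures, or inboxes.

```python
import json

# Context URI of the ActivityStreams 2.0 vocabulary used by ActivityPub.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_like(actor: str, obj: str) -> dict:
    """Build a 'Like' activity: `actor` likes `obj` (both are plain URLs here)."""
    return {
        "@context": AS_CONTEXT,
        "type": "Like",
        "actor": actor,
        "object": obj,
    }


def count_likes(activities: list, obj: str) -> int:
    """Count distinct actors who liked `obj`, whichever server they came from."""
    actors = {
        a["actor"]
        for a in activities
        if a.get("type") == "Like" and a.get("object") == obj
    }
    return len(actors)


if __name__ == "__main__":
    # Hypothetical actors on two different servers liking the same page.
    page = "https://blog.example.org/posts/1"
    received = [
        make_like("https://social.example.net/users/alice", page),
        make_like("https://other.example.com/users/bob", page),
    ]
    print(json.dumps(received[0], indent=2))
    print(count_likes(received, page))
```

The point of the sketch is only that the "Like" lives in an open, serializable format that any provider could produce or consume, which is what would let small providers pool interactions.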
Right. It would take some act of compulsion to make them comply, or some clever hack to automate scraping and mirroring. While most of the customers accept FB's legitimacy, they still rule, and the effort to hack or legislate might not obviously pay dividends.
Will this last forever, or go the way of MySpace?
FB appears to be more robust than MySpace because of the inertial qualities of its comprehensive membership, but that may be changing.
2020 is the year "surveillance capitalism" entered the popular lexicon, and 2021 has kicked off with an ugliness that could well be transformative. If there is a revolution, new tools, infrastructure, and rules are required to establish a new social media platform that avoids the pitfalls of the old.
It is very interesting that, with the exception of people like Berners-Lee, computer scientists around the world have decided that the problems of social networks should be addressed only within the realm of private companies. I see little to no coordinated activity targeted at open social networks for commenting, liking, and sharing. There is a similar pattern with open and distributed protocols for searching and sharing data. It seems to me a failure of academia in this important area of computing.
It is important to remember that distributed protocols for social interaction are not something new that researchers had not considered before. Email is the prime example of an open, distributed protocol that is still very successful. But many researchers have stopped considering such open protocols and jumped on the walled-garden bandwagon.
This is a bit of a tangent, but I think there's a market for a "self-tracker" data hub of sorts. My half-baked idea is that it'd run locally and ingest my activity of all sorts -- privately, securely, individually -- to help inform my personal knowledge base. Along the lines of Readwise but broader and deeper, and with analytics....
1000% yes. Local logging of activities from lots of applications and devices. Everything gets added to a full-text search index, and the data only lives locally unless otherwise specified.
I don't know. I'd venture a guess of "no" on pluggable apps. It does have a plugin-like architecture with e.g. adapters for Twitter. I've never used it, only seen it in action. It's an activity and data tracker written to serve Brad Fitzpatrick's personal needs. When he was still at Google, he was paying someone to hack on it and maintain it. It has some rudimentary "analytics" insofar as there are various ways to present the data stored in it.
… store all my stuff forever, not worrying about deleting, or losing stuff.
… save stuff easily, and without categorizing it or choosing a location whenever I save it. I just want a data dumptruck that I can throw stuff at whenever.
… never lose anything because nothing can be overwritten (all blobs are content-addressable), and there’s no delete support. (optional garbage collection coming later)
… be able to search for anything I once stored.
… be able to browse and visualize stuff I’ve stored.
… not always be forced into a POSIX-y filesystem model. That involves thinking of where to put stuff, and most the time I don’t even want filenames. If I take a bunch of photos, those don’t have filenames (or not good ones, and not unique). They just exist. They don’t need a directory or a name. Likewise with blog posts, comments, likes, bookmarks, etc. They’re just objects.
… have a POSIX-y filesystem when I want one. And it should all be logically available on my tiny laptop’s SSD disk, even if my laptop’s disk is miniscule compared to my entire repo. That is, there should actually be a caching virtual filesystem, not a daemon running rsync in the background. If I have to have a complete copy of my data locally, or I have to “choose which folders” to sync, that’s broken.
… be able to synthesize POSIX-y filesystems from search queries over my higher-level objects. e.g. a “recent” directory of recent photos from my Android phone (this all works already in 0.1)
… not write another CMS system, ever. Perkeep should be able to store and model any type of content, so it can just be a backend for other apps.
… have backups of all my social network content I created daily on other people’s servers, to protect myself if my account is hijacked, the company goes evil, changes ownership, or goes out of business.
… have both a web UI and command-line tools, as well as a FUSE filesystem.
… be in control of my data, but also still be able to utilize big companies’ infrastructure cloud products if desired.
… be able to share content with both technical and non-technical friends.
Most of this works as of the 0.1 release, and the rest and more is in progress."
IOW, exactly what I was looking for. Sometimes I'm reminded why I still love the internet.
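For anyone wondering what "all blobs are content-addressable" and "nothing can be overwritten" mean in practice, here is a minimal in-memory sketch. It is not Perkeep's actual API or blob-reference format: the `BlobStore` class, its method names, and the `sha256-` prefix are assumptions chosen for illustration.

```python
import hashlib


class BlobStore:
    """Toy content-addressable store: the key is the hash of the content."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store `data` and return its reference (derived from the bytes)."""
        ref = "sha256-" + hashlib.sha256(data).hexdigest()
        # Same content always maps to the same ref, so an existing blob
        # is never overwritten with different bytes.
        self._blobs.setdefault(ref, data)
        return ref

    def get(self, ref: str) -> bytes:
        """Fetch a blob by reference; raises KeyError if it was never stored."""
        return self._blobs[ref]

    def __len__(self) -> int:
        return len(self._blobs)


if __name__ == "__main__":
    store = BlobStore()
    first = store.put(b"a photo without a filename")
    again = store.put(b"a photo without a filename")  # deduplicated
    print(first == again, len(store))
```

Because the reference is computed from the content, there is no "where do I put this" decision and no filename needed, and storing the same thing twice is a no-op. There is deliberately no delete method, mirroring the "no delete support" item in the list.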
That's not exactly true. Solid is a refinement of HTTP, and HTTP is and always has been closer to "messages and ports" than files. <https://news.ycombinator.com/item?id=17897325> (Not a critique of whether your tldr is an accurate summary of the forum post or not, just that the criticism itself doesn't hold.) Having said that, moving to the imperfect abstraction of a dumb filesystem alone would be beneficial (better than what we're doing now, at least), even if that's all it ever amounted to, and we were stuck with that bad abstraction forever.
We should simply outlaw most privacy-invasive behavior. People will still demand news, social-media etc, but the payment will be different. Technology cannot and must not solve everything.
1. Ad-tracking: we lived for decades without advertisers being able to track responses, viewership, etc. Ads will still be valuable for business if we remove those options.
I don’t think privacy is tenable. I think we could eventually physically engineer ourselves to be more cooperative and less driven to harm others. This is achievable, but privacy is not stable, is unnatural, and because it is so hard to come by, striving for it creates in my opinion needless scarcity.
Trying to avoid harming others is also not tenable, but a policy of avoiding harm is capable of being followed with soft penalties, and I think neuroticism can be avoided. I think striving for privacy and its preservation is inherently neurotic, though it can be a short-term successful policy in the presence of others who would harm, exploit, or subjugate us.
I'm glad TBL is working on this stuff, but at this point I can't help but feel like this is doomed to a very slow uptake, like how Semantic Web and IPv6 are still emerging technologies. I am reminded of esr on Plan 9:
> Unix creaks and clanks and has obvious rust spots, but it gets the job done well enough to hold its position. There is a lesson here for ambitious system architects: The most dangerous enemy of a better solution is an existing codebase that is just good enough.
It ultimately depends on the governments. They have been quite proactive in mandating the end of some technologies in the fields of TV and light bulbs. Not sure why they are dragging their feet so much with IPv6, now that it was finally finalized in 2017, and even Europe ran out of IPv4 addresses in late 2019?
@wcerfgba That is such a great quote about unix and an existing codebase being good enough! Do you know the source? I'd love to refer to that in my presentations at work, etc. Thanks!
How do you grant a company access to your data but prevent them from storing it? And how does it apply to data a company generates about me? For example, if I listen to songs on Spotify, are they supposed to somehow not store it, but still give me recommendations?
If you don't want them to know what it is, encrypt it. Even if they store it, it's not much use.
If you don't want them to keep it, find a way to invalidate it. (This would be for where the read key is time sensitive.. not sure how to make that work)
I think the privacy angle is misguided. Most people don't really care about it. Even more so for stuff like which songs I listened to on Spotify.
The better angle is that we're becoming digital serfs. Google decided that they didn't want Google Music to exist anymore and poof went my listening history and playlists. Any service that I use today can do the same thing. If that data were stored somewhere I had access to I could have imported it in to Spotify.
This is an area I think Amazon or CloudFlare could step into. Sell consumers a NAS type box that keeps their data local. Sell companies on Lambda/Workers @ Home and have their applications run on that NAS.
> If that data were stored somewhere I had access to I could have imported it in to Spotify.
At the moment we've been pushing services in the wrong direction to create their own schemas. However, we may win back control with standards on this one.
But yes, the idea is that you are able to remove the control they have over the data you've produced. It's such a terrible argument to claim they own the data. (Also, why do they need to control that other than to try to prevent you from leaving?)
People do care about privacy, it's just that they have Snapchat-style privacy concerns, not the hypothetical ones that technologists tend to talk too much about. You're right that people don't care about YouTube having access to their stuff; they care about people having it—people like Regina, or their manager (or Regina, their manager). The whole "digital serfdom" concept is as abstract of a concern (and in the minds of many, as irrelevant) as the classic surveillance capitalism arguments that you're putting down, even if the digital serfdom concept is accurate. People just don't care about anything that isn't an immediate concern.
If you are asking how it would be technically feasible, there are essentially two ways off the top of my head.
1. End-to-end encryption. They store your data, but without your password, it's encrypted in the db and useless.
2. You pass all your data in every request, like a sqlite file or something.
To me, this is the major flaw with Solid: Why use a third party's service at all? Why shouldn't your apps and your data be on your own server? Sandstorm and Cloudron already do this, and make it user-friendly to install, remove, and share web apps with people from a private space. Furthermore, Sandstorm also assumes apps are malicious, so it is relatively safe to install proprietary apps on-device and still prevent data exfiltration.
There are very few types of apps which truly need a third party server to work.
> if I listen to songs on Spotify, are they supposed to somehow not store it, but still give me recommendations?
This is perfectly possible.
In your example, Spotify could store the data they needed for their recommendation algorithm in aggregate form so that any link to a person was destroyed and not reversible.
And then make recommendations by running that algorithm on your locally/privately stored data, with no loss of functionality.
As such, a recommendation algorithm does not technically benefit from storing your personal data, at all.
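A rough sketch of that split, under some loud assumptions: the service keeps only aggregate "these two songs are often played by the same listener" counts with no user identifiers, and the client scores songs against a listening history that never leaves the device. The function names `build_aggregate` and `recommend` are invented for this example, and simple co-occurrence counting stands in for whatever a real recommender does. It also does not show that such aggregates are truly irreversible; very small counts can still leak information.

```python
from collections import Counter
from itertools import combinations


def build_aggregate(histories):
    """Service side: count song pairs that co-occur; keep no user identifiers."""
    pairs = Counter()
    for songs in histories:
        for a, b in combinations(sorted(set(songs)), 2):
            pairs[(a, b)] += 1
    return pairs


def recommend(local_history, pairs, k=3):
    """Client side: rank unheard songs by co-occurrence with local history."""
    seen = set(local_history)
    scores = Counter()
    for (a, b), n in pairs.items():
        if a in seen and b not in seen:
            scores[b] += n
        elif b in seen and a not in seen:
            scores[a] += n
    return [song for song, _ in scores.most_common(k)]


if __name__ == "__main__":
    # Aggregate built once from many listeners, then shipped to clients.
    aggregate = build_aggregate([
        ["song_a", "song_b"],
        ["song_a", "song_b", "song_c"],
        ["song_b", "song_c"],
    ])
    # The personal history stays local; only the aggregate is downloaded.
    print(recommend(["song_a"], aggregate, k=2))
```

The design choice being illustrated is only where the computation runs: the personal history is an input on the client, never a row in the service's database.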
Yesterday I got pissed when I tried to download a podcast episode. It is available on Apple, Google and Spotify, but these platforms won't let you download a simple mp3.
Ended up having to pay for the network traffic.
Freedom is the better technology, and Solid claims to offer freedom but if you look closely it doesn't.
In what world does a specification designed by committee, describing functionality that has existed for at least 15 years, and that furiously lobbies the government for its forced adoption, have anything to do with freedom?
How does ActivityPub help me compete with Facebook? Why can TikTok get popular without it?
Maybe companies should be forced to offer me an RSS feed of MP3s. Maybe not MP3 but some open format, and we should force chip makers to add special instructions to their chips for optimal playing speed.
I run a podcast, and built my own distribution to Spotify, iTunes, etc. Those platforms just consume an RSS feed, and the MP3s are hosted in my S3 bucket.
Here's what the RSS feed looks like:
https://codechefs.dev/rss.xml
If you do some google searches on "podcast name rss" I'm sure a public feed will pop up
But yeah I'm not sure why these platforms don't let you see the rss feeds though
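If you do find the feed, pulling the direct MP3 links out of it is a few lines of standard-library Python. This is a sketch that assumes a plain RSS 2.0 podcast feed where each `<item>` carries an `<enclosure url="...">` element; the sample feed and its media.example.com URLs are made up, and `episode_urls` is just a name chosen for the example.

```python
import xml.etree.ElementTree as ET

# Made-up feed in the usual RSS 2.0 podcast shape (not a real podcast).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://media.example.com/ep1.mp3" length="12345" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://media.example.com/ep2.mp3" length="23456" type="audio/mpeg"/>
    </item>
    <item>
      <title>Text-only announcement</title>
    </item>
  </channel>
</rss>"""


def episode_urls(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, audio URL) for every item that has an enclosure."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None or not enclosure.get("url"):
            continue  # skip items without an audio file attached
        episodes.append((item.findtext("title", default=""), enclosure.get("url")))
    return episodes


if __name__ == "__main__":
    for title, url in episode_urls(SAMPLE_FEED):
        print(title, url)
```

With a real feed you would fetch the XML first (for example with `urllib.request`) and pass the text to `episode_urls`; the enclosure URL is the file the platforms themselves download.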
We need apps built on this technology. It will go a long way to making it succeed and working out the nuances. Interesting tech is interesting. Something that's generally useful or solves a normal problem... that's something people will pick up.
Something I rarely see brought up in these articles is the difficulty of defining "data" and who would own it. If I buy something on Amazon to be shipped to me from a third party seller, which part of that transaction is data that belongs to me? To Amazon? To the third party?
The article ventures a short list: "websites visited, credit card purchases, workout routines, music streamed", but I don't see how that could ever be turned into a coherent definition. A "credit card purchase" likely involves a dozen distinct parties with their own individual role and view of the event.
The Solid Project uses OpenID Connect authentication (which is built on top of the OAuth 2.0 protocol). The claim is that Solid uses OpenID Connect to uniquely identify every single shared object in a Pod. OpenID Connect is then responsible for access control to the Pod's resources. It is an interesting and intriguing enough idea to be worth investigating (even if only out of curiosity).
I adore this project, and wish for it to succeed, but how do we incentivize or force companies to accommodate pods? Legislation? That's about the only way I can think of.
> “No one will argue with the direction,” said Liam Broza, a founder of LifeScope, an open-source data project. “He’s on the right side of history. But is what he’s doing really going to work?”
While I totally support Tim's project, history will decide what is "on the right side of history". Unless he's from the future?
Well.. obviously it's speculation. Specifically a form of speculation describing Liam's belief in this technology.
With that said, depending on what specifically Liam had in mind with that quote, I don't think it's far off. Tim's technological choice might be right or wrong, but it's difficult to argue against the idea that people should be able to own more of their data than they do now. Is there some pro-Google argument that would argue they're the ideal hosts for your data?
I guess as long as it isn't a form of "we are the good guys = we cannot lose (at least in the long term)"…
I guess that I found the whiplash with the next phrase somewhat funny.
As for Google, they're certainly very competent at their job of gathering (and using) the world's information. Which makes them both tempting to use, and also extremely dangerous. Also, remember their old motto? I wonder how many of today's Googlers still believe that they're the "good guys"?
It's a great project and it would solve a lot of problems we face today for sure.
But I fear it will suffer from a problem of adoption.
People had to learn HTML and HTTP back in the day, because it was the thing that would make it possible to transfer information over the wire with a platform called the browser.
It was the same with the Windows API, VB, and Delphi, as it is with Android and the iPhone today.
People will learn that thing not because it will 'save the world', sure some will, but for more pragmatic reasons. So you also have to offer those pragmatic reasons to people, because those reasons are also important after all.
I know TBL was more or less on the "hippie" side of the web standards, and that was very important for keeping the web's core and foundation on the right track.
But it was not because the HTML standard was great as a piece of technology; it was the energy and the people that formed around it that made it happen, through patient iteration over browsers, until browsers became a thing no one could avoid.
I'm saying this as somebody working more or less on the same problem, but who has taken a different approach.
The problem is hard because the state of the art now is very sophisticated. You will have to compete with browsers and app platforms for mindshare, and I think you can only do it if you propose a new platform that people understand as a better approach.
And I must say, the web alone as it is, is a broken foundation to lay out this sort of thing, for a lot of reasons.
So we need a new sort of browser, one that's so different that you actually won't even be able to call it a browser anymore.
This is what I'm trying to do. Trying to solve the same sort of problems, but with a different take than Solid.
But I must say it's pretty hard, because you also have to offer, at least as a starting point, what browsers and application platforms already offer to developers. Along with this, there's a need for an incentive on the part of the user, the ultimate consumer of the thing. And this is also a hard problem, because you will need to offer something people want and don't have already.
I think I've got this, but only time will tell. And even if the thing is somehow "right", even then you might suffer from lack of adoption, as the incentives might not be enough and the 'killer app' that would make the platform boom might never show up.
I want something a bit more extensive than this. I want something plug-and-play and portable where I can unplug my data in seconds and plug it back in when I want/need to.
I had this idea in the early 90s that so much data was being, and would in the future be, collected about us that citizens should be able to incorporate into abstract entities that served as their data proxies and to which commercial entities would attach their tracking. So Corp UUID xyz bought gas at an Amoco station, not Mr John Doe.
I gave up on this idea as the www form of internet arrived and e-commerce, adserve, cookies, and all other modern forms of surveillance capitalism flourished.
The idea that you can somehow control someone’s observation of your activities, and that you are entitled to privacy, or obscurity, or to be hidden, or forgotten, I realized (or came to think) was quixotic and antisocial.
It is a conflicted and torturous path to take, because many real abuses occur and a lot of harm is done with data that is collected and analyzed.
I think a statutory right to partake in ownership of your data sounds sensible, but I think it too is unworkable and going in the wrong direction - to scarcity, fear, and the complement of fear is aggression.
> Tim Berners-Lee wants to put people in control of their personal data
This seems like a movement that would be at odds with the interests of people who fund elections. It could easily trigger the bazillionth instance of corps and legislators uniting to squash a public interest.
I don't see how GDPR would break this. It would be compatible with what GDPR hopes to accomplish. This is a technology solution which helps you to enforce GDPR redactions.
Unfortunately at the bottom is a copyright notice. Nothing is going to “put people in control of their personal data” as long as we have copyright. Otherwise lots of your “personal data” will remain locked up with corporations.
You do not have control of your data when you have no idea or control over the software and hardware that you store your data on. A situation that will never change as long as we have #ImaginaryPropertyLaws.