by rakoo 258 days ago
This is a beautifully written introduction to the architecture of AT, but after much consideration I will still remain on ActivityPub for the time being.

I love the idea of defining data formats first and then building on top of them. It's the only way we should do anything, because if you have the data, everything can be rebuilt on top. Unfortunately, the way AT works is all contained in here:

> Social aggregation features like notifications, feeds, and search are non-negotiable in modern social products. [...] Coincidentally, that’s the exact mechanism you would use for aggregation. You listen to events from all of your app users’ repositories, write them to a local database, and query that database as much as you like with zero extra latency. [...] This might remind you of how Google Reader crawls RSS (rip).
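To make the quoted mechanism concrete, here is a minimal sketch of "listen to events, write them to a local database, query it". The event shape, the table layout, and the names `build_index` and `like_count` are assumptions made up for illustration; this is not the actual atproto event format or API.

```python
import sqlite3

def build_index(events):
    """Write events from many repositories into one local SQLite database.

    The event dicts used here are a hypothetical shape, not the atproto wire format.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE posts (repo TEXT, rkey TEXT, text TEXT, PRIMARY KEY (repo, rkey))")
    db.execute("CREATE TABLE likes (repo TEXT, subject TEXT, PRIMARY KEY (repo, subject))")
    for ev in events:
        if ev["type"] == "post":
            db.execute("INSERT OR REPLACE INTO posts VALUES (?, ?, ?)",
                       (ev["repo"], ev["rkey"], ev["text"]))
        elif ev["type"] == "like":
            # A repository can like a given subject only once; repeats are ignored.
            db.execute("INSERT OR IGNORE INTO likes VALUES (?, ?)",
                       (ev["repo"], ev["subject"]))
    db.commit()
    return db

def like_count(db, subject):
    """Aggregate query answered entirely from the local index."""
    row = db.execute("SELECT COUNT(*) FROM likes WHERE subject = ?", (subject,)).fetchone()
    return row[0]

events = [
    {"type": "post", "repo": "alice.example", "rkey": "1", "text": "hello"},
    {"type": "like", "repo": "bob.example", "subject": "alice.example/1"},
    {"type": "like", "repo": "carol.example", "subject": "alice.example/1"},
    {"type": "like", "repo": "bob.example", "subject": "alice.example/1"},  # duplicate
]
db = build_index(events)
print(like_count(db, "alice.example/1"))
```

The point of the sketch is that the like count is only as complete as the set of repositories the index has listened to, which is why the size of the aggregator matters in what follows.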

In order for the social aspect to work, all data must at some point be aggregated in a single place. That single place must then be huge, as it scales linearly with the activity of the network; in a still-capitalist world this means that it will always be run and led by money, unless some extraordinary volunteer-based project like Wikipedia springs up. The example of Google Reader is to the point: Google was the biggest tech company at the time, provided the service for free, and decided to stop because it didn't care anymore.

In fact Google Reader is a very good comparison. AT works exactly as if you had websites, each with its own RSS feed, and then a big relay called Google providing search, feeds, notifications, ... But as we all know, by being the middleman between producers and readers, Google gained an astonishing amount of power. That is the business model described by Cory Doctorow when he talks about enshittification. Put yourself in the middle, and everyone will depend on you.

The only way an AT-based product works at scale, i.e. with everyone easily talking to everyone, is with one or a few mega-intermediaries between every one of us. I fear this is not going to solve any of the issues we have.

What is different in ActivityPub? Intermediaries are definitely useful for some services, but once your network is built you don't need them anymore: content flows directly between repositories, no middlemen needed.

In short: if we want a single network at large scale, AT requires large-scale centralization points, while AP certainly needs them but could survive without them. Either we face that, or we start exploring and living within small-scale networks.

2 comments

Well, atproto can scale down too if you’re content with a subset of the data. In other words, it’s not that atproto requires you to have the full network, it’s that it lets you build apps over the whole network. ActivityPub doesn’t offer an approach to do that. So we’re not comparing apples to oranges.

I do think that you’re underestimating the value of an open network for large-scale aggregation. Yes, for a big open world you need big indexes. But indexes don’t always have to be built by a single entity. Some can be shared. Resources can be pooled for apps that need a materialized index of the same data. We haven’t really seen how this plays out yet because big indexes have only existed behind closed doors so far.

And if all else fails, limiting the scope (by time or community) works in atproto too. It’s just… not as fun :)

The subtext of my comment is that the people building and pushing for atproto are not building for small networks but for big networks. Yes, technical solutions can be found; the problem is, as usual, not technical but social. What kind of organization can build and maintain world-scale indexes? What kind of people can be in those organizations? There is very little reason to believe that those intermediaries will not behave the same way intermediaries have always behaved, if we don't also challenge the socio-economico-political system they are developed in, which developers have an automatic reaction not to do. That's where ActivityPub is different: the social aspect of what it takes to have viable communities cannot be evaded. Sometimes with bad consequences, though.

I guess I just don’t believe that any solution where you don’t have a full view of the network is aligned with what normal people want. Maybe AP is on a bigger mission to teach people that they’re “wanting wrong things” and actually you should enjoy a system where everyone sees a different like count and half the replies are missing. I think it’s a dead end.

That's a purely personal point of view, but I don't think the AP people claim that "people want the wrong things", but rather that "the things you want have a cost and we're not hiding it".

The fact that like counts differ and half the replies are missing is not specific to AP but to implementations not willing to actually follow the AP community: in fact the SocialHub (https://socialhub.activitypub.rocks/) community is the place where all coordinated development happens, and solutions to those issues have already been designed and implemented in multiple pieces of software, with the notable exception of Mastodon. Maybe that's the issue: people keep looking at Mastodon to understand AP, but Mastodon is one of the worst examples of AP, even when talking only about the technical domain. It doesn't implement the C2S API, it doesn't have portability, it has the inconsistent like counts and missing replies you mentioned, ...

Mastodon/AP is difficult to discuss because pointing to flaws of Mastodon leads to people saying "it's just a Mastodon problem", but AP doesn't by itself specify much so it's hard to critique it too. If there's a "flavor" of AP that's competitive with what atproto solves (can "walk away" without cooperation, can "revive" and "remix" data from other apps, can "fork" products with all their data), I'd like to read a condensed summary of that architecture.

AP, even in its purest, simplest form, already allows you to "revive" and "remix" data: you have the JSON of the data, you can do whatever you want with it. You can build a product that ingests the JSON of this data, just like in AT, but those products don't really exist, so even if I can say "yes you can fork them" it's all talk.

There's no talk about migration in the core spec, but how does it work in ATproto in practice? Does everyone carry their entire repository, live from their own PDS, ready to be migrated somewhere else? It does feel like in AT-in-practice people just have a PDS and that's it? (I've never used it, I might be totally wrong)

Really, if you want a simplistic view of AP, it's easy: it's an INBOX where you receive content, and an OUTBOX where you send content, and the content is activitystreams-formatted JSON. Anything else just makes it easier to work with.
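As a rough illustration of that simplistic view, here is an in-memory sketch: each actor has an inbox and an outbox, and what moves between them is ActivityStreams-shaped JSON. The `Actor` class, its `publish` method, and the example URLs are assumptions for illustration only; real servers deliver over HTTP with authentication, which this skips entirely.

```python
import json

# The ActivityStreams 2.0 JSON-LD context URL.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"

class Actor:
    """Hypothetical in-memory actor: an inbox, an outbox, and a follower list."""

    def __init__(self, actor_id):
        self.id = actor_id
        self.inbox = []      # activities received from others
        self.outbox = []     # activities this actor has published
        self.followers = []  # other Actor objects to deliver to

    def publish(self, content):
        """Wrap a Note in a Create activity, record it, and deliver it."""
        activity = {
            "@context": AS_CONTEXT,
            "type": "Create",
            "actor": self.id,
            "object": {"type": "Note", "attributedTo": self.id, "content": content},
        }
        self.outbox.append(activity)
        # Direct delivery to each follower's inbox, with no aggregator in between.
        for follower in self.followers:
            follower.inbox.append(activity)
        return activity

alice = Actor("https://alice.example/actor")
bob = Actor("https://bob.example/actor")
alice.followers.append(bob)
alice.publish("hello fediverse")
print(json.dumps(bob.inbox[0], indent=2))
```

Note that bob only ever sees what is delivered to him, which is the trade-off being argued about in this thread.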

That's why I always say AP-in-practice. It handily avoids any "but the spec says!" diversions.

> That's a purely personal point of view but I don't think the AP people claim that "people want the wrong things", but rather that "the things you want have a cost and we're not hiding it".

I think that's a great way of putting it, and it's at the root of a lot of problems we have today. Our society increasingly encourages people to make money by coaxing other people into doing things whose costs are hidden.

What I found most exciting while reading this article was the promise that you can “up and leave” and take your data with you without breaking links, because the links are based on a domain name you control.
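A toy sketch of why that works, assuming links are keyed by a domain the user controls rather than by the host storing the data. The `hosting` table, the `hosts` stores, and the `resolve` and `migrate` helpers are all hypothetical and much simpler than how atproto actually resolves identities.

```python
# Hypothetical lookup: domain name -> host currently serving that user's data.
hosting = {"alice.example": "host-a.example"}

# Data held by each host, keyed by (domain, record key).
hosts = {
    "host-a.example": {("alice.example", "post/1"): "hello"},
    "host-b.example": {},
}

def resolve(link):
    """Resolve a 'domain/record' link via whichever host the domain points at today."""
    domain, _, rkey = link.partition("/")
    return hosts[hosting[domain]].get((domain, rkey))

def migrate(domain, new_host):
    """Move all of a domain's records to a new host and repoint the domain."""
    old_host = hosting[domain]
    for key in [k for k in hosts[old_host] if k[0] == domain]:
        hosts[new_host][key] = hosts[old_host].pop(key)
    hosting[domain] = new_host

link = "alice.example/post/1"
before = resolve(link)
migrate("alice.example", "host-b.example")
after = resolve(link)
print(before == after)
```

The link never mentions the host, so moving the data does not break it.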

This is not so in ActivityPub. The data you post is owned by/controlled by the instance you're on. In the language of the article, you're still a row in somebody else's database.

I was on Mastodon for a while until the instance I was on shut down. I naively assumed that I could export and re-import my posts but that was not so. Everything is deleted. I technically have an archive of it in the form of some JSON files, but as illustrated by the article, this is now dead data. The same will happen again if/when my current instance shuts down. The only way around it is to run my own instance, which for the vast majority of people is a ludicrous proposition.

If we're talking strictly ActivityPub, they're exactly the same: servers where your data lives. AT's PDSes give you access to your data, but proper AP servers also do that: you have your collections, and all your activities in them. The trick is to recognize which software is actually proper ActivityPub software, and unfortunately Mastodon isn't. The current issue is not with ActivityPub.