by jakebsky 849 days ago
Hey HN, the engineering team at Bluesky is especially excited to get to this point! We're happy to help answer questions and help anyone trying to run their own PDS host. Things should work pretty well for self-hosters right now, but we're standing by to help if there are any problems.

Technical details and the installer are in the GitHub repo https://github.com/bluesky-social/pds

And we're on Discord available to help: https://discord.com/invite/UWS6FFdhMe

14 comments

Unrelated to engineering but the recent rebrand to a dead butterfly logo[1][2][3] may be off brand for a platform wishing to communicate a more open, social Internet built on first principles and scientific rigor.

[1]https://www.emilydamstra.com/please-enough-dead-butterflies/

[2]https://news.ycombinator.com/item?id=14460013

[3]https://bsky.social/about/blog/12-21-2023-butterfly

Pedantic lepidopterists of the world, unite!
> built on first principles and scientific rigor.

Are you joking? This is private enterprise we're talking about. We'll all die before this company or anything similar is built on "scientific rigor" unless it directly relates to their profit margins.

> If you hadn’t previously noted the difference between a living and a dead butterfly, I’m afraid you will now begin to see dead butterflies EVERYWHERE, as I do.

I didn't know this (as most of us I'd guess). It was an interesting read though, thanks.

It's true. I read it some time ago...they're everywhere. You can't unsee it.
I think the value of a symbol is in the idea it communicates. Most people don't see a dead butterfly. They just see a butterfly. The reality of whether it is more or less like a dead butterfly doesn't especially matter unless a significant amount of people interpret the symbol that way.
It would probably be worth clarifying in that repo what the license is for both the code in that repo and the code that it's actually running. It looks like it's just a very thin wrapper around @atproto/pds, which is MIT/Apache 2.0 [0], but the repo you link to has no license.

Edit: now it has one! Thanks!

[0] https://www.npmjs.com/package/@atproto/pds

Yup, it's MIT/Apache 2.0. We'll fix that. Thanks for the heads up.
Awesome! Why did you choose Caddy as a proxy for PDS? (Caddy creator here.)
Thanks for Caddy, Matt! Some of us on the team have been using Caddy for years, for many of our projects, because it's so simple, sufficiently high performance, and has lots of nice features.

The on-demand TLS certificates with an "ask" endpoint are especially useful for the PDS use case, because there's generally a wildcard DNS name that is used to give each new user a domain handle (@alice.example.com), but we don't want to be vulnerable to a TLS certificate DoS/rate limit situation.
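
To make the "ask" flow concrete, here is a minimal sketch in Python. As far as I understand it, before issuing an on-demand certificate Caddy sends a GET request to the configured ask URL with the requested hostname in a `domain` query parameter, and only proceeds if the response is a 2xx. The PDS hostname, the set of known handles, and the function names below are hypothetical and are not taken from the actual PDS code.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical values for illustration only.
PDS_HOSTNAME = "example.com"
KNOWN_HANDLES = {"alice.example.com", "bob.example.com"}


def is_allowed_domain(domain: str) -> bool:
    """Return True if a certificate may be issued for this hostname."""
    domain = domain.strip().lower()
    if domain == PDS_HOSTNAME:
        return True
    # Only handles that actually exist on this PDS qualify; an arbitrary
    # subdomain that merely matches the wildcard DNS record does not.
    return domain in KNOWN_HANDLES


def ask_status(request_url: str) -> int:
    """Map an incoming ask request URL to the HTTP status to respond with."""
    query = parse_qs(urlparse(request_url).query)
    domains = query.get("domain", [])
    if len(domains) != 1:
        return 400  # malformed request: expected exactly one domain
    return 200 if is_allowed_domain(domains[0]) else 404
```

The point of the check is that someone hitting made-up subdomains can't force a certificate request for each one, which is the DoS/rate limit situation mentioned above.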

Great reasons -- glad to hear that! Let me know if you encounter any hiccups or have feedback.

Love the fresh federated model btw!

Even if it may be simple in some areas, it doesn't handle edge cases such as https://github.com/caddyserver/caddy/issues/1632 in other areas out of the box unlike other server software.
That is a bit unfair, as it is intentionally not doing so. You may disagree with it, sure, but as it stands I think your comment implies oversight or immaturity, which is evidently not the case reading the discussion on the issue you linked.
>That is a bit unfair, as it is intentionally not doing so.

That doesn't change my point. I am pointing out an easy pitfall Caddy users can fall into, since it is not automatically handled for them as it is with other server software, nor is it pointed out in the documentation for Caddy. Simple server software would avoid these pitfalls automatically for users. So while it may now be simple to get https working, properly configuring the server is now more complex to get right.

An intentional pitfall doesn't mean it isn't a pitfall.

you have been repeatedly posting this incredibly niche complaint for years at this point
Is it possible you're confusing that user with me? I used to be relatively vocal about this issue on HN. For reference, that's not me.

And it's probably not niche if dozens of users are posting about it for years.

I have only brought this up once before on HN and it was over 2 years ago. Not adopting a new project because it is missing something niche is an extremely common reason why people stick with tried and true, mature software. I do not see anything wrong with pointing out niche issues because to some people these issues are important. Because it's broken out of the box, it is allowing people who aren't aware of this problem to continue to set up broken sites. Even caddyserver.com. is broken.
Curious. What is the use case here? I’ve spent tens of thousands of hours of my life on the Internet and a lot of that as a sysadm and I’ve not once heard of people accessing or linking to sites this way.
Not for nothing, but when accessed from this HN app on an iPhone, Apple’s website with a trailing dot does not render correctly.
Hi. If the protocol is open, the software is free and the main instance openly federates with self-hosters, what's the monetization strategy here? Clearly it's not "harvest all the data and figure it out later" as that avenue seems to be shut down internationally by strengthened privacy laws and ads don't work well with federation and third party clients. Is "grow first, figure out how to make money later" still a viable strategy in this economy?
managed hosting perhaps? It works in the email industry at least (Google and Microsoft nearly dominate the email biz)
Yeah but that assumes ATP reaches anything even remotely approximating the ubiquity of email rather than ending up like Google Wave (not literally by being handed off to Apache - which took Wave behind the barn in 2018 in case you're wondering what happened to it).
Given the PDS server works on ports 80/443 and I'd like to use a domain (@nytimes.com in the documentation, but say @example.com), how does it interoperate with existing services that already operate on @example.com , for example a website, blog, cloud.

I'd imagine this use case is quite common for self hosters. If it can't operate alongside an existing, say, nginx on this port, are there recommended alternate practices?

I'm excited at separating identity from hosting, of which self hosting identity gets us closer.
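
One detail that may bear on this question: as I understand the atproto handle spec, a handle is tied to a DID either through a DNS TXT record at `_atproto.<handle>` or through a plain-text file served at `https://<handle>/.well-known/atproto-did`, so the handle domain doesn't necessarily have to be the machine running the PDS. A minimal Python sketch of where a resolver would look (the function names are hypothetical, and no network calls are made):

```python
from typing import Optional


def handle_lookup_locations(handle: str) -> dict:
    """Return the two places a resolver would check for a handle's DID."""
    handle = handle.lstrip("@").lower()
    return {
        "dns_txt_name": f"_atproto.{handle}",
        "https_url": f"https://{handle}/.well-known/atproto-did",
    }


def parse_txt_value(value: str) -> Optional[str]:
    """Extract the DID from a TXT record value of the form 'did=<did>'."""
    prefix = "did="
    if value.startswith(prefix) and value[len(prefix):].startswith("did:"):
        return value[len(prefix):]
    return None  # unrelated TXT record (SPF, site verification, etc.)
```

With the DNS route, an existing website on example.com shouldn't need to change at all. Whether that covers a given setup is worth checking against the PDS docs.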

Hi, what is the status of integration with the ActivityPub protocol? It's currently the most popular protocol in federated social media.
That was quite the mess. Ryan Barrett is a smart guy and seems quite nice, but it was very ill-advised to unilaterally decide to build an opt-out bridge. In general, if users on platform A want their stuff to be on platform B, they'll find a way to make that happen. If someone else takes it upon themselves to copy everything from A to B, people understandably get pretty bent about it.

If it had been an opt-in system, the response would probably have been far different.

I'm surprised that the tool in question is Bridgy Fed. Bridgy Fed has existed for a long time and is a very useful tool. Its alternative, Bridgy, has also been used to bridge between closed social networks and the open IndieWeb.

Why are Fediverse people only angry about it now? It's an open protocol. If you want privacy, don't publish something for the entire world to see. That's just basic common sense. At the very least, use Mastodon's privacy controls. The Fediverse is not special here, it doesn't get to destroy the open Web for everyone else.

In general, people on the Fediverse want to be able to make local moderation decisions; the way that extends to other federated sites is by not federating with them. Most Fediverse sites will not federate with sites that have bad or nonexistent moderation (or simply incompatible moderation policies). Bluesky's architecture basically means that it is one big unmoderated site. The normal reaction of Fediverse admins is then to block it.

As a controversy, it's been blown out of proportion. It's just Fediverse admins setting the moderation policies for their own sites, as always.

I agree that they should block it, I just don't think they should be harassing Bridgy Fed developers. The GitHub issues were pretty insane.
Well first not everyone on the fediverse is opposed to the bridge. I agree that public is public. But there are concerns about moderation being incompatible, it’s normal to voice them.

As for the fediverse destroying the open web for everyone else, I think you’re hyperboling quite a bit, the fediverse has done mountains to make social media more open, probably more than everyone else.

Yeah you're right, I think I did overgeneralise there. I was meaning more of the culture of "Mastodon users"; Mastodon itself has done a lot to help the open Web too.

Though I think "voicing concerns" is a bit of an understatement. I feel really bad for the developer of Bridgy Fed, working on their passion project and just getting caught up in all this heat and harassment.

I'm one of the people who block the bridge, not because of privacy concerns but because it's bluesky, specifically. I do not like the idea of for-profit, vc-backed entities being given data, or any kind of decision. We all know exactly where that leads, a term has been coined, mountains have been written about it and yet it still happens.

It's the same situation with Threads.

As for privacy I disagree with you. There's nothing because nothing has been discussed, but the technical feasibility should never dictate what we want as a society. When a family member dies, even though the news is known you know how to behave, who to share that information with, what to say. Would you be okay with a company coming up to you and saying "hey we learned your mother died, would you like to tweet it? It is free!"

I half agree that technical feasibility isn't everything. For example, I can murder someone with a knife. But it's not what we want as a society.

But it's not never. For example, I can see your post. So I can send a screenshot of your post to my friends to discuss it. I can't see your hard drive contents. So I shouldn't hack into it and send a screenshot of your hard drive to my friends to discuss it.

So technical feasibility influences what is reasonable to do as a society. It's not "feasible = reasonable", or else murder would be reasonable, but it definitely does influence it.

And in this case, I believe what the Fediverse people commenting in the GitHub issue want to be unreasonable. It is unreasonable to publish something publicly, on a federated network, where privacy controls exist (but are not being used), and then claim to have an expectation of privacy in a public space, especially when such bridges provide utility and benefit to others that just can't happen if it's opt-in.

It is reasonable to block it. It is unreasonable to expect everyone else to restrain themselves from using public data in the spirit of the open Internet. It is especially unreasonable to harass non-profit bridge developers in this case. That's not a social solution, just harassment.

Note about copyright: that's not a path one should go down, because it'll make the Fediverse illegal as a whole. It's probably fair use, anyway.

Note about AT protocol: yes it's designed by a for-profit, but it's good. Just because something is for-profit and VC-backed does not mean it will enshittify; take Element for example. It solves a lot of issues that people were having with Mastodon such as global full text search and a global feed. I would use it if it only had more relays to spread out the control.

Public is public.

And someone else will just go build an opt-out (or maybe even no opt-out!) bridge.

Nah. Consent is a thing and this wasn't consensual. Yes, the posts were publicly accessible, but the intent of posting to Mastodon isn't to have it show up automatically on another network. It's technically possible, yes. It's still a dick thing to do and it pissed people off.

And again, it wasn't about Bluesky in particular. If Google announced that they were going to ingest all Mastodon content and post it in a new Google Groups kind of thing, they'd be pretty understandably upset about that, too.

In general, "if I wanted my stuff on Bluesky, I would have put it there". It wasn't the bridge creator's decision to make.

Public = consent for the public to see it. That includes the public on Bluesky. It was consensual. And the ruckus was in fact about Bluesky in particular. That's why the same project already supported other protocols without a big ruckus.

In general, "I want my stuff on Bluesky but don't want to deal with cross-posting to multiple different platforms and keeping up with responses on all of them"

And, "I want my stuff on whatever platform people want to read it on without having to individually approve each one" (which is quite literally the entire point of public posts on Mastodon).

OH - and it wasn't the bridge creator's decision anyway; it was the decision of people on Bluesky to follow you that would trigger your posts to be federated, so...

> "if I wanted my stuff on Bluesky, I would have put it there"

How about "If I wanted my stuff on your Mastodon server, I would have put it there"?

"If I wanted my Mastodon content on your RSS feed, I would have put it there".

How about "If I wanted my stuff on the Internet, a publicly available internet, I would have put it there".

This tribalism around network/brands/protocols is beyond stupid. The thing that is killing Twitter is its closedness and the assumption that the means of communication is what matters. It's not. Let open protocols be open.

If people want privacy, then they should use a secure communication protocol and not a social media network.

> Yes, the posts were publicly accessible, but the intent of posting to Mastodon isn't to have it show up automatically on another network.

I thought that was the point of activitypub.

>If Google announced that they were going to ingest all Mastodon content and post it in a new Google Groups kind of thing, they'd be pretty understandably upset about that, too.

exactly like they did with usenet without any issue?

> Consent is a thing and this wasn't consensual

The whole point of a fediverse is it's a federation. Therefore there is implied consent to copying from one instance to another.

> but the intent of posting to Mastodon isn't to have it show up automatically on another network

Mastodon isn't a network, the network is the fediverse. Mastodon is some software that runs on the network.

What thing is consent?

Mastodon is an odd sort of network, there's more blocking than I expected and it somehow seems as if blocking is an intrinsic part of the design. In Mastodon, blocking looks like a choice one makes for whatever reasons, not an unloved measure needed for fighting abuse.

As if the design doesn't tell users "you can follow people in the fediverse" but rather "your ability to follow people in the fediverse is limited by you and three other parties and the software isn't among the three".

So… if the mastodonish idea of consent doesn't extend to all of the fediverse, what makes bluesky different from some unvetted mastodon site run by weird people? If the poster's/follower's/would-be follower's consent isn't taken for granted in one case and isn't taken for granted in the other, what makes the two cases different? There obviously is a technical difference, but what is the difference wrt. consent?

> If Google announced that they were going to ingest all Mastodon content and post it in a new Google Groups kind of thing, they'd be pretty understandably upset about that, too.

Except that's not what the bridge does, at all. It only follows you on someone's behalf when someone on Bluesky specifically requests to follow you through the bridge.

I’m a sucker for a particular mix of condescending plus wrong.
And Fediverse admins will block that bridge, just like they would any other site with bad/nonexistent moderation, and will advise each other to block it. That's just how moderation works in the Fediverse. I guess it's sad that, unlike the admin of an instance with bad moderation, the bridge operator can't do anything to fix the problem, but in the end, that's their problem.
Hey! Congrats on the release.

Does the AT Protocol only optimize for Twitter-like flows, or does it allow for other types of social applications to be built like Activitypub? For example a reddit-like social media.

Currently, atproto works probably best for public social apps, like microblogging, forums, etc. So yes, it's definitely possible to build a reddit-like social app on atproto.

Part of the change today is that the PDS and Relay[1] now support non-app.bsky record types. This is quite new, so there could be issues, but we're prepared to fix any issues that crop up.

1. https://bsky.social/about/blog/5-5-2023-federation-architect...

> microblogging

Would it be possible to use it for macroblogging, i.e. long posts with markdown markup, embedded images, etc? If so, is there a python library that implements atproto?

Yes, it should be totally possible to build a blogging system on atproto. And the "app.bsky" API should serve as an example for almost all of the functionality required.

Another really neat aspect of atproto, is that apps can interact theoretically. So you might create a blog system but use "app.bsky" (Bluesky) for comments.

OAuth support is coming soon as well, which is a big step in simplifying auth.
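
Going back to the blogging question, here is a rough sketch of what a long-form post record might look like, in Python since that was the language asked about. atproto records are JSON objects with a `$type` field naming their schema, and they are addressed by URIs of the form `at://<did>/<collection>/<record key>`. The `com.example.blog.post` collection, its fields, the DID, and the record key below are all made up for illustration; a real app would define its own Lexicon schema.

```python
import json
from datetime import datetime, timezone

# Hypothetical collection name (NSID); not a real published schema.
BLOG_COLLECTION = "com.example.blog.post"


def make_blog_record(title: str, markdown: str, created_at: datetime) -> dict:
    """Build a record body for a hypothetical long-form post schema."""
    return {
        "$type": BLOG_COLLECTION,
        "title": title,
        "markdown": markdown,
        "createdAt": created_at.astimezone(timezone.utc).isoformat(),
    }


def at_uri(did: str, collection: str, rkey: str) -> str:
    """Format the at:// URI that addresses a record in a user's repo."""
    return f"at://{did}/{collection}/{rkey}"


record = make_blog_record(
    "Hello, macroblogging",
    "# Heading\n\nA much longer post with *markdown*.",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
print(at_uri("did:plc:hypothetical123", BLOG_COLLECTION, "first-post"))
```

Comments could then, in principle, be ordinary "app.bsky" posts that refer back to the blog record's URI, along the lines described above.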

Congratulations on the release! If I may ask a question - is it possible to register an account without a phone number on a 3-rd party server?
Thanks!

Yes, it's totally up to a PDS operator to decide how they create user accounts. It's also not required on the Bluesky PDS service any longer, in most cases.

By default the self-hosted PDS requires an invite code, to prevent random people from creating an account. Later other options will exist, including OAuth support which is coming soon.

That's great, thanks!

> It's also not required on the Bluesky service any longer, in most cases.

That's also nice to hear - the last time I tried to register an account (shortly after the free registration launch) the phone number field in the registration form was marked as required, if I am not mistaken.

Yeah, you're right, it was. That was a temporary measure during the public launch to prevent spam/abuse. We've made some improvements here recently.
I'm a little confused why the PDS server is both dockerized and has an installation exclusive to Ubuntu/Debian.
Yeah, there's nothing preventing someone from running the PDS server on other distributions. The installer just does a few convenient things for you (like installing Docker, opening ports 80/443 using ufw, etc.) and we haven't added and tested support for other distributions.

There is a Docker compose file in the repo, and advanced users shouldn't have any problems running the code on another distribution or even without Docker if they prefer.

Advanced users can just view the installer script as documentation.

Why do you need to open ufw if it runs in Docker? Docker does its own routing magic and will happily blast right through any ufw rules.

Very cool to see this available though, I might have to try it out later this week!

Are there any independent projects implementing the AT protocol?
There are a number of independent projects using atproto in various ways.

There's an (incomplete) list here: https://docs.bsky.app/showcase

And the protocol is documented here: https://atproto.com

Thank you, I might be searching for the wrong things, but I don't see any independent servers. There's clients, libraries, bots, but no servers, am I missing something?

My question was motivated by the fact that from the outside the AT proto ecosystem looks pretty monocultural, and personally I don't trust that. :)

Federation was just opened up with this announcement. I don't think there was a lot of energy for working on independent PDSes until after this has happened. In the past day, a bunch of copies of this reference PDS have been deployed. We'll see how things change in the future.

Basically, you're right, but just because you're asking early on. This is about to change real quick.

Sorry to be dismissive but if people need a reference PDS to implement a protocol, I kinda lose faith in its expressiveness.

I'm contrasting this through the prism of having seen people implementing ActivityPub software as early as 2017, when it was merely a draft and everyone was complaining about how many things had been left out.

There are dozens of things being built on the protocol. Just not PDSes. And it’s not because it’s difficult, but because there’s no point until there are others to try and connect to.
I've also had pretty good luck using the GH topics for finding obscure projects https://github.com/topics/atprotocol https://github.com/topics/bluesky https://gitlab.com/explore/projects/topics?search=atprotocol (empty, but I usually check there, too)
Will this work for bare metal?

I use BSD, and all I see is an installer for Debian/Ubuntu.

No guide in sight for bare metal nor telling you what services/software are required.

yeah it works fine on bare metal, you'll just have to do a bit more set up work yourself (https terminating and such). The installer script should be instructive in how to run it but you'll have to figure out the BSD specific stuff
Look into the service folder in the repo—this repo is just a very thin packaging wrapper for a JS library, which you should be able to run anywhere you can run Node.
If I wanted to create a consumer hardware product that packages the PDS host in a user-friendly interface, does the software license permit that?

Also, services like Twitter started off with a developer friendly open API, and then it got closed off when the business needed to make money off the platform. What's the difference with Bluesky?

(I don't work at bluesky)

It's MIT/Apache 2.0 licensed, so yes. However, because it's also an open protocol, even if it wasn't, you could write your own under whatever license you want.

> What's the difference with Bluesky?

BlueSky is built off of an open protocol, called AT. https://atproto.com/ BlueSky is a particular app built on the protocol. As such, there's no way to "turn off the API," as BlueSky itself is a participant in the open protocol.

They could like, re-write everything to be a central service, port the user data over to it, and then pull out from the network, but then two things would happen:

1. stuff would break, as it's no longer part of the network.

2. since there is true account portability, users could simply swap to a different PDS and client, and re-route around the damage.

Also given that it's against their entire stated mission and goals, it would be social suicide.

Ok, I think I understand the hype now — this is sweet? Thanks for the reply.

When I was thinking about rolling my own federated social service (during the whole Twitter / Mastodon shuffle), I started thinking about the negative impacts of federation. “How can I build a kill switch into this thing in case bad actors start to participate?”

Of course, that means a central authority having moderation control.

But then, part of me thought.. does there need to be a kill switch? Do we want centralized moderation control? If laws are being broken, or social issues are being pressed to the fringe (*lons X), then the ISP would intervene. Or… the responsibility of moderation shifts elsewhere than the provider of the software. (To where do moderation responsibilities shift, if anywhere? I think is still a question.)

I personally landed at the spot of: it’s really no different than what the WWW provides — it’s just easier access. But, is that easier access to self host content and media dangerous? (lol I guess I’m still conflicted about what could happen with wider ease-of-access to self hosting.)

… back to sleep. This stuff literally keeps me up at night lol

(I didn’t get to the other comments further down in this post that talk about content moderation .. but I’m seeing them now!)

Now that individual posts can be viewed without logging in, is there a way to view/load a feed without authentication?

I'm working on a client and there's a specific scenario where I want to be able to show a feed like "Top 20 - Past 3 Hours" before a user has logged in to their Bluesky account.

Gonna be that guy!

Any chance the team could create a Home Assistant add-on for this? https://www.home-assistant.io/addons/

I think the Home Assistant community would go WILD for being able to self-host their Bluesky data straight from home with just a few clicks.

It's a pretty big crowd of people. https://analytics.home-assistant.io/ 327k willing to opt-in to analytics.

we need a new version of Zawinski's Law: every system capable of deploying plugins will eventually expand until it is a full hosting solution.

I know if there's one thing I'm eager to do it's to host even more stuff in that clunky piece of shit that has half a dozen main menu items for nonsense and buries everything of interest or value under "Settings"

The add-ons are just docker containers?

It's wasteful to get an entire second machine for something that can use the resources available on the machine running Home Assistant OS

How would it interact with or extend home assistant?