	by __justplaying 595 days ago
	didn't say it was cheap!

2 comments

nightpool 595 days ago

But why is it required? Do you really need a copy of everyone's data locally? If the only way to self-host Bluesky is to keep a copy of the entire database, that seems really bad from a scaling perspective.

half-kh-hacker 595 days ago

What else would "self-hosting all of Bluesky" mean other than a copy of the entire site? If you just want to participate in the network, host a PDS, which only stores your own posts.

nightpool 595 days ago

Surely there's some middle ground between only hosting your own data and being reliant on another site to keep track of your following / followers and hosting a duplicate copy of the entire network?

steveklabnik 595 days ago

For sure. If you just want to host your own data, you can do that. A PDS for you and maybe some friends is very small and cheap to host.

nightpool 595 days ago

My understanding though is that having a PDS on its own is useless without an AppView to collect the data from the relay? Or am I misunderstanding the architecture here? https://docs.bsky.app/docs/advanced-guides/federation-archit...

steveklabnik 595 days ago

I'm talking about the case where you wanted to run your own PDS and use all of the other infrastructure being run by Bluesky.

If you fully want your own copy of everything, then you'd want to run a copy of everything. But you don't have to. It really depends on what your goals are. That's why the post is about the maximal scenario. "Just your own PDS" is the minimalist scenario. But I think it's the one that makes sense for 95% of users who want to self-host.

half-kh-hacker 594 days ago

Your following list is stored in your own repo, so it lives on your PDS. You can theoretically have partial replicas of the network, but nobody has bothered yet. If you want to make software like that, a good start would be subscribing to the firehose and filtering down to the DIDs you care about, or supplying the watched DIDs parameter to a Jetstream instance.
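A rough sketch of the Jetstream route, in case it helps. Assumptions on my part, not from anything above: the public host name, the `wantedDids` / `wantedCollections` query parameter names, and a top-level `did` field on each JSON event all reflect my reading of the Jetstream docs, so check them against the instance you actually use. The DID is a made-up placeholder, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed public Jetstream host; replace with the instance you actually use.
JETSTREAM_HOST = "jetstream2.us-east.bsky.network"


def build_subscribe_url(wanted_dids, wanted_collections=(), host=JETSTREAM_HOST):
    """Build a WebSocket URL asking the server to filter events by DID.

    The parameter names (wantedDids, wantedCollections) are assumptions
    based on the Jetstream documentation.
    """
    params = [("wantedDids", did) for did in wanted_dids]
    params += [("wantedCollections", c) for c in wanted_collections]
    return f"wss://{host}/subscribe?{urlencode(params)}"


def keep_event(raw_message, watched_dids):
    """Client-side filter for an unfiltered stream.

    Returns the decoded event if its top-level "did" is one we watch,
    otherwise None. Assumes each message is a JSON object.
    """
    event = json.loads(raw_message)
    return event if event.get("did") in watched_dids else None


if __name__ == "__main__":
    watched = {"did:plc:example1"}  # placeholder DID, not a real account
    print(build_subscribe_url(sorted(watched), ["app.bsky.feed.post"]))
```

You would hand the URL to whatever WebSocket client you like and store the events that come back. `keep_event` covers the other option mentioned above: consuming an unfiltered stream and dropping everything you don't care about on your side.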

fiatjaf 594 days ago

The middle ground you're looking for is impossible in the AT protocol; it is, however, what the Nostr protocol is aiming for.

jazzyjackson 595 days ago

"Self-host an entire copy of all user data" is a pretty cool capability to have, kind of proof that the infrastructure is really open and forkable. You seem to have misunderstood OP's goals. Serving your own data from a personal data server is a much less arduous affair.

galactus 595 days ago

Uh, it is not required. You can run only a PDS if you want to self-host your data, and everything will work.

But it is indeed very cool that you can actually host a relay if you want (for fun, learning, or whatever reason).

bombcar 595 days ago

Ten terabytes of spinning rust is only $100-$300 or so. That's not bad at all.

jonstaab 595 days ago

My point is not the current size, it's the eventual size if Bluesky succeeds. Facebook ingests 100TB/day. Self-hosting a Bluesky relay at that scale isn't (won't be) a thing.
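To put rough numbers on that, here is a back-of-the-envelope sketch. The inputs are just the figures quoted in this thread (a 10 TB disk and a 100 TB/day ingest rate), taken at face value and not verified. The function name is made up for illustration.

```python
DISK_TB = 10             # disk size quoted upthread
INGEST_TB_PER_DAY = 100  # ingest rate quoted above (unverified)


def hours_until_full(disk_tb, ingest_tb_per_day):
    """Hours for a disk of disk_tb to fill at a constant ingest rate."""
    return disk_tb * 24 / ingest_tb_per_day


if __name__ == "__main__":
    hours = hours_until_full(DISK_TB, INGEST_TB_PER_DAY)
    print(f"A {DISK_TB} TB disk fills in about {hours:.1f} hours")
```

At those rates, one such disk would last a couple of hours rather than months.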

galactus 595 days ago

It could be a thing. Not for individual tinkerers, but for companies. The fact that today, with 14 million users already, it is still possible for an individual to host it is amazing.
