by ddevault 2247 days ago
Who's to say? It's not GitHub scale, and even if everyone in this thread moved to SourceHut, it still wouldn't be GitHub scale, but it would be serving your needs just fine. I feel totally comfortable recommending SourceHut over GitHub as a service which can be expected to have better uptime and performance, because it is a fact - even if we operate at different scales.

And I believe sr.ht would beat out GitHub at their scale anyway. The services are an order of magnitude more lightweight. And the design is more fault tolerant: we use a distributed architecture, so one part of the system can go down without affecting anything else - as if GitHub's issues could go down without anything else being affected. And many of our tools are based on email, a global fault-tolerant system, which would allow you to get your work done more or less unaffected even if SourceHut was experiencing a total outage. We'd automatically get caught back up with what you were up to in the meanwhile once we're online, too.

I've also spoken to GitHub engineers about some of GitHub's internal architectural design, and I'm confident that SourceHut's technical design beats GitHub's in terms of scalability. And, despite already winning by a good margin, I'm still spending a lot of effort to push the envelope further on performance and scalability.

> Who's to say?

And then you go on to say it. I'm glad that SourceHut exists, and I like many of its principles, and it's probably better designed too, but walking into a thread where someone is having an outage and then claiming that you'd do much better is in poor taste no matter how good you are or how many of your services work offline.

I responded directly to someone who said they were considering alternatives, and wouldn't've otherwise.

Right, and I think it is great to bring up how your service can handle outages better than GitHub would due to it being decentralized. The part I have issue with is saying that you'd do better than GitHub about keeping your site up, pointing to the issue that they are in the middle of resolving. That just seems like kicking them while they're down, especially since you haven't actually shown that you can do better. (Yes, you have had good uptime in the past, but I don't see what's stopping the power going out to some of your servers, or you pushing a bug into production, or any number of other things that shouldn't go wrong but often do, especially as the number of users increases.)

>what's stopping the power going out to some of your servers

Redundant power supplies

>pushing a bug into production

Nothing, but again, SourceHut is demonstrably better in this regard: because it's distributed, a bug in production would only affect a small subset of our system, and the system knows how to repair itself once the bug is fixed.

And I don't think I need to apologise for kicking Goliath while he's down. Someone said they want alternatives, so I pitched mine with specific details of how it's better in this situation, and that doesn't seem wrong to me. I would invite my competitors to do the same to me. We should be fostering a culture of reliability and good engineering - and if I don't hold my competitors accountable, who will? "Here's an alternative" has more teeth than "I wish this was better."

> SourceHut over GitHub as a service which can be expected to have better uptime and performance, because it is a fact

Most of us could throw any of the open source solutions on a $20 Linode instance and probably have excellent uptime. How many active repos do you host, and on how many servers?

About 18K git & hg repositories, for about 13.5K users. We also run about 5,000 CI jobs per week, including for some large projects like Nim, Zig, Neovim, OpenSMTPD, etc. We have 10 dedicated servers at the moment. And I didn't throw an open source solution on these servers - I built these open source services from the ground up.

So you're comparing your scalability with a company with over 40m users and 100m repos.

Can you talk about the geographic distribution of your 10 servers?

I would like to remind you of my earlier point:

SourceHut is not the same scale as GitHub. This does not change the fact that SourceHut is faster and more reliable. We have an advantage - fewer users and repos - but still, that doesn't change the fact that we're faster and more reliable.

This has been objectively demonstrated as a numerical fact:

https://forgeperf.org

And yes, 9 of those servers are in Philadelphia (the other is in San Francisco, but it's for backups, not distribution). That doesn't change the fact that, despite being more distant from many users, our pages load faster. In this respect, we are at a disadvantage relative to GitHub, but we're still faster.

GitHub and SourceHut are working at different scales. That doesn't change the fact that SourceHut is faster.

I was considering your claim:

> we use a distributed architecture

> SourceHut is faster

I wasn't questioning that some of the web features are fast. I'm sure when GitHub ran on 10 servers their pages were fast too. I suspect if I threw GitLab on a 9-server cluster on AWS it would also be quick.

Not geographically distributed, but distributed in the sense that different responsibilities of the overall application are distributed among different servers, which can fail independently without affecting the rest. Additionally, the mail system on which many parts of SourceHut rely is distributed in the geographical sense, among the hundreds of thousands of mail servers around the world which have standard and 50-year-battle-tested queueing and redelivery mechanisms built in.
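
To make the queueing and redelivery point concrete, here is a minimal sketch of store-and-forward in Python. It is an illustration only, not SourceHut's code or any mail server's implementation: `OutboundQueue` and the `deliver` callback are hypothetical names, and real mail servers also wait between retries and eventually give up and bounce the message.

```python
from collections import deque


class OutboundQueue:
    """Hypothetical store-and-forward queue: messages wait until delivery succeeds."""

    def __init__(self, deliver):
        # deliver is an assumed callback that returns True when the
        # receiving side accepted the message, False when it is unreachable.
        self.deliver = deliver
        self.pending = deque()

    def submit(self, message):
        """Accept a message for later delivery, even if the receiver is down."""
        self.pending.append(message)

    def flush(self):
        """Try each queued message once; failed ones stay queued, in order."""
        delivered = []
        for _ in range(len(self.pending)):
            message = self.pending.popleft()
            if self.deliver(message):
                delivered.append(message)
            else:
                self.pending.append(message)
        return delivered
```

If the receiving service is down, `flush()` delivers nothing and the messages stay queued; once it is back, the next `flush()` delivers them in the order they were submitted, which is the "caught back up" behaviour described earlier in the thread.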

And yes, throwing GitLab on a 9-server cluster on AWS might be fast. But I'm ready to bet you that SourceHut would still be faster, and I have a ready-to-roll performance test suite to prove it. And I know that SourceHut is faster than GitLab.com and GitHub.com, and every other major host, and you don't have to go through the trouble of provisioning your own servers to take advantage of SourceHut's superior performance.

> This has been objectively demonstrated as a numerical fact:

While your tests are indeed objective, I don't think they're very useful. For example, why does your performance test ignore caching?

GitHub's summary page loads 27KiB of data for me unauthenticated, which is about 6% of the 452KiB you're displaying in your first table. The vast majority of developers who browse GitHub will not be loading 452KiB of static assets every single page load.

Anecdotally, GitHub's "pjax" navigation feels about as fast as SourceHut on my aging hardware.

Even with caching, SourceHut is a lot smaller than that. SourceHut benefits from caching, too - the repo summary page goes from 2 requests and 29.5K to 1 request and 5.7K with a warm cache. And in many cases, the cache isn't the bottleneck, either - dig into the Lighthouse results for specific pages to see a more detailed breakdown.
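
For anyone who wants to check the figures quoted in this exchange, the arithmetic can be sketched as below. The inputs are only the numbers given in the comments above; treating "29.5K" and "5.7K" as KiB is an assumption, and `reduction` is a helper name made up for this example.

```python
def reduction(cold, warm):
    """Fraction of transfer saved when going from a cold to a warm cache."""
    return 1 - warm / cold


# GitHub summary page: 27 KiB observed with a warm cache vs. 452 KiB in the table.
github_share = 27 / 452

# SourceHut repo summary page: 29.5K cold vs. 5.7K warm (assumed to be KiB).
sourcehut_saving = reduction(29.5, 5.7)

print(f"GitHub warm/cold share: {github_share:.1%}")        # 6.0%
print(f"SourceHut saving with cache: {sourcehut_saving:.1%}")  # 80.7%
```

On these numbers the warm-cache SourceHut page (5.7K) is still about a fifth the size of the warm-cache GitHub page (27 KiB), though page weight alone says nothing about perceived latency.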