Hacker News
by bdcravens 2247 days ago
So you're comparing your scalability with a company with over 40m users and 100m repos.

Can you talk about the geographic distribution of your 10 servers?


I would like to remind you of my earlier point:

SourceHut is not the same scale as GitHub. This does not change the fact that SourceHut is faster and more reliable. We have an advantage - fewer users and repos - but still, that doesn't change the fact that we're faster and more reliable.

This has been objectively demonstrated as a numerical fact:

https://forgeperf.org

And yes, 9 of those servers are in Philadelphia (the other is in San Francisco, but it's for backups, not distribution). That doesn't change the fact that, despite being more distant from many users, our pages load faster. In this respect, we are at a disadvantage relative to GitHub, but we're still faster.

GitHub and SourceHut are working at different scales. That doesn't change the fact that SourceHut is faster.

I was considering your claim:

> we use a distributed architecture

> SourceHut is faster

I wasn't questioning that some of the web features are fast. I'm sure when GitHub was 10 servers their pages were fast too. I suspect if I threw GitLab on a 9-server cluster on AWS it would also be quick.

Not geographically distributed, but distributed in the sense that different responsibilities of the overall application are spread across different servers, which can fail independently without affecting the rest. Additionally, the mail system on which many parts of SourceHut rely is distributed in the geographical sense, among the hundreds of thousands of mail servers around the world which have standard and 50-year-battle-tested queueing and redelivery mechanisms built in.

And yes, GitLab on a 9-server cluster on AWS might be fast. But I'm ready to bet you that SourceHut would still be faster, and I have a ready-to-roll performance test suite to prove it. I also know that SourceHut is faster than GitLab.com, GitHub.com, and every other major host, and you don't have to go through the trouble of provisioning your own servers to take advantage of SourceHut's superior performance.

> This has been objectively demonstrated as a numerical fact:

While your tests are indeed objective, I don't think they're very useful. For example, why does your performance test ignore caching?

GitHub's summary page loads 27KiB of data for me unauthenticated, which is about 6% of the 452KiB you're displaying in your first table. The vast majority of developers who browse GitHub will not be loading 452KiB of static assets every single page load.
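As a quick check of the arithmetic above, here is a minimal sketch that uses only the two figures quoted in this comment (27 KiB observed, 452 KiB reported in the table). The helper name `share_of` is made up for illustration.

```python
def share_of(part_kib: float, total_kib: float) -> float:
    """Return part_kib as a percentage of total_kib."""
    if total_kib <= 0:
        raise ValueError("total_kib must be positive")
    return 100.0 * part_kib / total_kib


# Figures quoted above: 27 KiB loaded unauthenticated with a warm cache,
# versus the 452 KiB shown in the first table of the benchmark.
github_share = share_of(27, 452)
print(f"{github_share:.1f}% of the uncached payload")
```

The result rounds to the "about 6%" figure given above.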

Anecdotally, GitHub's "pjax" navigation feels about as fast as SourceHut on my aging hardware.

Even with caching, SourceHut is a lot smaller than that. SourceHut benefits from caching, too - the repo summary page goes from 2 requests and 29.5K to 1 request and 5.7K with a warm cache. And in many cases, the cache isn't the bottleneck, either - dig into the Lighthouse results for specific pages to see a more detailed breakdown.
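To put those warm-cache numbers in proportion, here is a rough sketch. It assumes "K" means the same unit in both figures, and the `PageLoad` type and `reduction_percent` helper are hypothetical names, not part of any benchmark tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageLoad:
    """One measured page load: request count and transferred size."""
    requests: int
    kilobytes: float


def reduction_percent(cold: PageLoad, warm: PageLoad) -> float:
    """Percentage of transferred bytes saved by a warm cache."""
    return 100.0 * (cold.kilobytes - warm.kilobytes) / cold.kilobytes


# Numbers quoted above for the SourceHut repo summary page.
cold = PageLoad(requests=2, kilobytes=29.5)
warm = PageLoad(requests=1, kilobytes=5.7)

saved = reduction_percent(cold, warm)
print(f"warm cache saves {saved:.1f}% of the transfer "
      f"and {cold.requests - warm.requests} request(s)")
```

By this measure the warm cache cuts the transfer by roughly four fifths, in addition to dropping one request.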