Hacker News
by johnfn 41 days ago
Everyone wants to pin this on the Microsoft acquisition or incompetence but it seems pretty clear to me from the material GitHub has posted that AI has 10xed the amount of code being committed to GH, which has downstream effects everywhere - CI, Actions, code ingestion, everywhere. The author pins it on weird things like MS Copilot, which kind of feels like he’s listing off things he doesn’t like rather than causal factors. This is ignoring the 800 pound gorilla in the room.
23 comments

The graph in TFA shows the downtime pattern starting in January 2020. OpenAI released GPT-3.5 in November 2022 (basically December), and LLM/agentic coding didn’t really kick off in the way you’re describing until 2024, but really in 2025.

How can that explain the terrible uptime for the ~4 years post acquisition before all the AI stuff you’re talking about started?

The graph is not accurate, because GitHub's historical downtime data is not accurate.

For example, here is a Hacker News story about GitHub being down on July 28th 2016: https://news.ycombinator.com/item?id=12178449

Here's GitHub's historical uptime graph (on which this chart is based), saying there was no recorded downtime that day, or in fact that entire month: https://www.githubstatus.com/uptime?page=40

GitHub launched a new status page Dec 2018[1]. It doesn't appear as if any history before Oct 2018 was ported over.

[1] https://github.blog/engineering/infrastructure/introducing-t...
[2] https://web.archive.org/web/20181211191456/https://www.githu...

That graph has bugged me since it went viral. The methodology is horseshit: https://github.com/DaMrNelson/github-historical-uptime

Just dumping HARs from devtools from a status site that hallucinates 100% uptime when it has no data. For example, all GitHub services had 100% uptime in June 1996: https://www.githubstatus.com/uptime?page=200

The graph gives GitHub Actions 100% uptime before it launched to GA in November 2019. That factors into the average uptime for every month on the graph before that. It's fully horseshit.
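To make the averaging problem concrete, here is a minimal Python sketch. It is not the linked repo's code: the numbers, the `LAUNCHED` table and both function names are hypothetical. It only assumes that the status page reports 100% for a service it has no data for, and compares a naive mean with one that skips services that did not exist yet.

```python
from datetime import date

# Launch months per service. "actions" uses the GA month mentioned above;
# the "git" date is a made-up placeholder for illustration only.
LAUNCHED = {"git": date(2010, 1, 1), "actions": date(2019, 11, 1)}


def naive_average(reported):
    """Average every reported number, including placeholder 100% entries."""
    return sum(reported.values()) / len(reported)


def launched_only_average(reported, month):
    """Average only the services that already existed in the given month."""
    live = [pct for svc, pct in reported.items() if LAUNCHED[svc] <= month]
    return sum(live) / len(live) if live else None


# Made-up figures for a month before Actions launched.
month = date(2019, 6, 1)
reported = {"git": 99.0, "actions": 100.0}

print(naive_average(reported))                 # 99.5
print(launched_only_average(reported, month))  # 99.0
```

With these invented figures, the placeholder 100% entry lifts the monthly average from 99.0 to 99.5, which is the kind of distortion described above.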

Looks like it's not accurate by under-reporting, not over-reporting. So their downtime was likely worse!
We don't have enough data to confirm if it's over or under reporting. This sample size of 1 is enough to prove the data is not perfectly accurate, but it's not enough to prove a skew bias in the data either way.
That's fair. We don't know.

I am making an assumption that if Microsoft saw a lot of false positive outages they would fix that, but might drag their feet if there was an outage that didn't get properly recorded (assuming it's automatic to begin with, it might be that a human needs to remember to update it).

Oh please, show me a company that has ever over reported their downtime. That's silly.
Or things didn't change much at all except Microsoft forced them to be more honest in their reporting.

See, I can just as easily make up a story that explains the chart.

The subjective experience I and others report is that GitHub feels like it has gotten significantly worse over the last few months. If you look at the month over month view of "Uptime history" in the cited link[1], it confirms this: it's been sub-90 (even sub-80 last month) essentially since the start of this year (i.e. when GitHub says that commit activity 10xed). Go back even a year and it's all in the high 9s.

I honestly can't explain the discrepancy between the graph in the article and the month over month stats on the same page, but the latter tracks both to my own subjective experience of GitHub and their own internal metrics.

[1]: https://mrshu.github.io/github-statuses/

I think it's just a case of brain drain, followed by reckless AI adoption, both of which drove the quality down.
The graph in the article is a lie, because GitHub's "historical data" is a lie.

https://www.githubstatus.com/uptime?page=3000

According to it, GitHub had 100% uptime from June to August 1996.

Yeah, I had the exact same response after reading the post. I mean, I'm all for jumping on the Microsoft hate train, but not if it misses the elephant in the room. Let's say the _perfect_ GitHub replacement spawns tomorrow? What's preventing the same infrastructure challenges of millions of lines of AI-generated code destroying it?

I think centralized code hosting is pretty much going to get killed by AI. Just like it's doing to social media.

> I mean, I'm all for jumping on the Microsoft hate train, but not if it misses the elephant in the room.

That elephant didn’t even exist yet for the first few years of poor uptime shown in the graph in TFA… I don’t really disagree if we’re talking about the recent uptime issues, but how does that explain the years 2020-2023?

It doesn't. It just means if they were having problems before, they've now been made significantly worse by AI (on the free tier). All I'm saying is that the problem is bigger than, "Microsoft sucks."
>What's preventing the same infrastructure challenges of millions of lines of AI-generated code destroying it?

There's something called "rate limits" that engineers not working for GitHub have probably heard of; it's this crazy idea that you should limit the load on your infra in order to avoid downtime. GitHub is not the first free service to ever have to deal with bots.
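For anyone who hasn't seen one, a rate limiter can be as small as a token bucket. This is a generic Python sketch, not anything GitHub is known to run; the class name, the parameters and the numbers in the usage note are all assumptions for illustration.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so it can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and spend tokens if the request fits, else False."""
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

One bucket per client (for example `TokenBucket(rate=1.0, capacity=5)`) lets a client burst five requests and then holds it to roughly one per second; a rejected request would normally get an HTTP 429 rather than be processed.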

SaaS code hosting seems to be the problem here. If companies self hosted, they could deal with the scaling problems themselves.
> SaaS code hosting seems to be the problem here. If companies self hosted, they could deal with the scaling problems themselves.

If all companies did this, there'd be no free tier on Github. You get the free tier because the SaaS customers are subsidising the free tier.

> I think centralized code hosting is pretty much going to get killed by AI. Just like it's doing to social media.

Private corporate codebases are a poor fit for GH because they don't benefit from public social graph effects. And the typical codebase isn't so large as to be technically challenging to deal with with OSS tools. I'd guess they make up a substantial share of revenue.

But once the reliability is called into question, self-hosted or smaller alternatives start to look good. Although there's some trickiness there if you want to be super cautious about making sure you can get to your code+infra in case of a vendor incident, especially if you're cloud based.

I don't even like AI much, and this still seems to me like yet another instance of people blaming AI for normal mismanagement and failure.
Because if you were building GitHub from scratch today you wouldn't build it the same way and would benefit from many of the technological advancements of the last 2 decades (nearly).
of all the awful things AI is doing and will be doing to society, killing centralized code hosting and social media will be its shiniest moments, both deserve to die painful deaths
Yes, the terrible sin of ... Hosting code where people can find it
I can’t remember the last time I looked for a project specifically on GitHub. I always come there via a link from another site.
hosting code where people can find it is the reason LLMs can write code, so we kind of screwed ourselves there…
How did people do it before github? Did everyone write everything with peek and poke?
Private people would keep their code locally and share the snapshot of the code using any file sharing or hosting option available.

Companies had been hosting their own CVS or later svn servers.

> How did people do it before github? Did everyone write everything with peek and poke?

I've been sharing GPL projects since 1999. We didn't need peek and poke (Both of which I have also used further in history...), but we managed nevertheless.

Prior to github I shared software on sourceforge (and others). Prior to that I published stuff on Freshmeat.

Prior to that I downloaded games others shared (not open source) on Happy Puppy.

Prior to that I used usenet to find and download games, shareware, etc.

Prior to that I used ftp to (IIRC) ftp.sunsite.edu, ftp.nic.fi, and others.

Prior to that I got news of new releases using Gopher.

Finally, prior to that, I actually did use peek and poke to write software :-/

If github went away, and centralised repos went away, we'd still have something...

Sourceforge
is this a serious question
No, it was rhetorical. I remember downloading software from sourceforge, distro servers and getting drivers from random people's websites that needed to be compiled.
Why is centralized code hosting getting killed? I'm running an open source project, >99% of the code is AI generated, could not do this without GitHub. AI-generated source code needs a place where AIs and people can collaborate. I'm expecting GitHub to be hugely successful, but mostly for an AI audience.
Because it's centralized. Your project pays the price for every unrelated project that's getting overloaded.
I'm sure the underlying infra is not a single server, so this is mostly a period where they have to adapt to higher loads due to AI becoming actually useable in the last 8 months. It's basically proof how well AI works these days. Give it a few months so they can scale and it'll get better. Remember Twitter fail whale? Growth pains that can and will be solved.
> It's basically proof how well AI works these days. Give it a few months so they can scale and it'll get better. Remember Twitter fail whale? Growth pains that can and will be solved.

GitHub's problems can technically be solved, but that doesn't mean they can be solved in a way where the economics still work out.

If AI use is 10x-ing the amount of infrastructure costs for GitHub but not 10x-ing the amount of money Microsoft brings in from GitHub then there is certainly no guarantee they will bother to solve these issues adequately.

And I'd be shocked if the revenue side of things isn't lagging way behind the extra usage post-AI-era, both because a lot of the new use is probably on the GitHub free tier, and because even on the paid tier most usage (other than CI/Actions, AFAIK) is on a fixed subscription cost per user regardless of how much you are slamming their servers, and it is unclear how much they can raise that price without current enterprise users fleeing.

Twitter had a clearer goal that aligned with the financials... support more people stably, show more ads. Things are less clear with GitHub's business model where the free tier is a loss leader for the paid tier but the expansion in usage is likely to balloon the free tier usage at a far faster rate than the paid tier usage.

Also (and this part is admittedly far more speculative) if AI labs are to be believed this is still early days for AI usage and we'll still see massive usage growth over the next few years. If GitHub is already having existential trouble at the beginning of the curve, what hope do they have to scale up with their current business model if AI usage actually does ramp up exponentially?

> And I'd be shocked if the revenue side of things isn't lagging way behind the extra usage post-AI-era, both because a lot of the new use is probably on the GitHub free tier, and because even on the paid tier most usage (other than CI/Actions, AFAIK) is on a fixed subscription cost per user regardless of how much you are slamming their servers, and it is unclear how much they can raise that price without current enterprise users fleeing.

I'd guess most of the costs incurred to GitHub outside of Actions as part of the enterprise flat-rate tier are a fraction of what enterprises are paying for AI in order to incur those costs in the first place.

If a company has to pay $5 extra to GitHub for every $100 of extra AI spend due to that AI use creating disproportionate load, I've got a hard time imagining that GitHub will be the thing that gets fled from.

As far as the free tier goes, it seems like there should be a path to making prohibitively-cost-incurring usage models high-friction. (e.g. limit the free Actions minutes that you get to a certain number per month.) As long as the limits are roughly proportional to the actual costs incurred, there's not too much risk of people fleeing to a competing service, because the only way a competing service would be able to undercut the costs is by taking steep losses themselves, which isn't much of a business model in order to attract people's code repositories.
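A cap like that is not much code either. Here is a rough Python sketch of a per-account monthly allowance; the class, the method name and the 100-minute figure are invented for illustration and say nothing about how GitHub actually meters Actions.

```python
from collections import defaultdict


class MonthlyQuota:
    """Track minutes used per (account, month) against a fixed free allowance."""

    def __init__(self, free_minutes):
        self.free_minutes = free_minutes
        self.used = defaultdict(int)  # (account, "YYYY-MM") -> minutes used

    def try_consume(self, account, month, minutes):
        """Record the usage and return True if it fits, else return False."""
        key = (account, month)
        if self.used[key] + minutes > self.free_minutes:
            return False  # over the free allowance: refuse, or bill instead
        self.used[key] += minutes
        return True


quota = MonthlyQuota(free_minutes=100)  # hypothetical allowance
print(quota.try_consume("alice", "2025-01", 60))  # True
print(quota.try_consume("alice", "2025-01", 50))  # False, would exceed 100
print(quota.try_consume("alice", "2025-02", 50))  # True, new month
```

The point of keying on the month is that the allowance resets without any cleanup job, and the refusal branch is where a paid tier would charge per minute instead.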

Yah, the monetization bit is challenging. I'll ask my agent to click some of the ads GitHub serves it ;-)

But getting this infrastructure right is crucial for a future where most of the code is AI generated. GitHub puts Microsoft in a good position to experiment and learn how to optimize GitHub (Enterprise) for the future.

Nate B Jones on YouTube, https://youtu.be/FDkvRl1RlT0?si=AEYlUchm_oalMSzf, argues that Atlassian might be an interesting acquisition for Anthropic, as it provides most of the context that AI at enterprises needs. When executed well, GitHub Enterprise can offer Microsoft the same value: the context AI needs in the future.

> AI-generated source code needs a place where AIs and people can collaborate. I'm expecting GitHub to be hugely successful, but mostly for an AI audience.

Are you paying them in proportion to the resources they expend on you?

There's this thing called "sustainability", and every company needs to have it. Github cannot continue on the current trajectory where every AI-bro wants to run an agent that generates 1000s of lines of code per hour, dozens of commits per hour... and provide that for free to a few dozen million users who won't pay.

That being said, Microsoft does have an opportunity here - AI-bros are willing to pay $200/m to burn tokens so Github should offer a plan for Copilot, say $400/m, that includes a repo.

If they don't ban AI agents on free tiers, they are going to be out of business soon.

GitHub hasn't changed in any positive way since the acquisition. A decade is a long time, it tells.

GitHub Actions, Copilot. Oh and that ugly AI search I'm unable to disable. Migration to Azure.

Yes Microsoft managed to ruin the network effect. Outages? The straw that broke the camel's back.

3 months post Microsoft acquisition, GitHub expanded the free plan to include unlimited private repos.

The next year they removed the limitation on collaborators on private repos for free users.

In the last 4 years they’ve significantly improved their project management tools. I think a lot of teams can make do with GitHub Projects, they’re pretty decent.

Who knows if any of these are directly because of Microsoft or not. But there have naturally been material improvements to GitHub in the years after being bought by Microsoft.

> GitHub hasn't changed in any positive way since the acquisition.

It's more like any positive actions they have had are being outright dismissed or forgotten. They removed several restrictions that Github had over private accounts, as well as github actions. Aside from the downtimes, the Github of today is fantastic compared to pre-acquisition Github.

I'm loving it, running an open source project mostly AI generated, I don't have to think about version control, building and testing my app, running AI code review, hosting my docs website, API and CLI to enable Claude Code to interact with everything, etc.

It provides huge value for anyone running an opensource AI generated project.

How on earth is Actions a downside?
I think they meant all the security holes that have been popping up and that there is no interest from Microsoft to fix them.
They do fix them. But not at the core. Just in the frontends
Yes, I posted the same observation 3 months ago. https://news.ycombinator.com/item?id=46877226

"Yes, it (AI) will kill open source—at least as we know it. I’m convinced that GitHub and GitLab will eventually stop offering their services for free if the flood of low-quality, "vibe-coded" projects—complete with lengthy but shallow documentation—continues to grow at the current rate."

Even if this is true: Microsoft owns an entire cloud platform. They have enormous codebases of their own and they employ ~200k people. It’s just not an excuse, especially because they consciously made decisions such as making private repositories free.
This would make sense if GitHub themselves cited increased traffic or load shedding as their root cause, but most of their incidents from the last month seem to cite misconfigured infrastructure or operational mistakes.
I like to think that Microsoft is trying to run GitHub on Windows in their Azure cloud. And every time GitHub is down I think of "someone updated the Windows Servers GH runs on and had to reboot everything".

While I'm 99% sure it is not true, it makes me sleep better at night. And giggle a little when it goes down.

They definitely do something with Azure. Stuff related to GitHub Actions runs is hosted on something.windows.net, which I believe is Azure.
A big part of the problem IS Microsoft acquisition. They forced them to move to Azure, which is terrible.

Around 8 years ago I was working for a company that they also acquired, and they also forced us to move to Azure. Performance was terrible and our system just wasn’t working there as it should. A few years later our service was dead and all customers moved to one of their office products.

If that's the case, we should also see the exact same pattern on Gitlab, Bitbucket, etc. Do we?
GitHub has been basically the default for free public git hosting for a long time. I was curious what bitbucket has and it looks like the free tier is so limited, I can't imagine a lot of people hosting vibe coded open source there.
10x of nothing is nothing.
What is easier to 10x? A tent or a flat?
Don't you think Microsoft ought to have thought a bit more about scale? They're not just innocent bystanders here. GitHub Copilot is a first class citizen of GitHub and so of course a lot of private enterprises are going to be using the thing that's bundled with the other thing.
Pray tell where are they going to get memory from?
They're part of the circular AI finance economy, I'm sure they can figure it out.
10x the code? Easy solution. Throttle unpaid customers or put a quota.

Either way, paid customers should not be affected.

Totally agree. People are saying Microsoft this, Microsoft that with their Microsoft hate, but they ignore the fact that the AI trend is making GitHub worse, and GitHub is trying to fix it.
I’m with you here. Further: Even though I disagree with it, “GitHub down, Microsoft bad” is a defensible take, but we’ve seen it ad nauseam at this point.
MS isn't solely to blame for the AI increase, but they are certainly part of the problem, including their integration of copilot into Github.
Gergely's newsletter claims it's more like 2.3x.
If load has increased so much so rapidly then GitHub should be rate limiting as needed instead of basically letting people DoS them.
The 800 pound gorilla in the room being a $3T company that also happens to be one of the largest cloud providers?

C'mon.

Why have they not simply asked the 800lb gorilla to solve this problem for them?
The author mentions this and links an article that expands on it
Github had lots of outages even before AI was introduced.
For upstarts, individuals, artists and idealists, Github was a means to reach and distribute code reliably to a large number of people on the planet. Is that true today? Will it ever?

97% of code coming in is AI slop. It's owned by an evil, rent seeking corp. Reliability is a flaming dumpster fire. And everything you commit there will be used to train more AI.

Github _is_ sinking.

Got me thinking, if 99% of code pushed to GH is generated by Claude, GH just becomes a free Claude distillation service. Gotta ban it on natsec grounds obviously.
And why is it wrong? The logic is there:

- Microsoft committed to AI.
- AI slop is increasing the costs for maintaining/running GitHub.
- GitHub is sinking.

This is interconnected. I can think of numerous other ways how this would be handled. But Microsoft went the AI slop way already. There is no way back for them.

We want to thank you for your heroic service in our defense, sir. We really need people like you who know which side they're on.

Microsoft investors