Hacker News
by Smudge 3680 days ago
I generally see two reactions to the "one codebase to rule them all" approach (used by Facebook, Google, et al.):

1. Holy god why would you let your code grow to such a massive, interdependent scale? Just release everything separately and versioned so that breaking changes don't affect everyone all at once. The idea of git being a bottleneck is absurd and you are using it wrong.

2. This is a very reasonable, practical approach to sharing code across a company. It reduces siloing and ensures that major refactors can happen in one pass without a ton of coordination. Better to fix the version control system than waste endless resources refactoring millions of lines of code.

Both reactions are valid. Having worked in both styles of codebase, I recognize that there are trade-offs in either case. The optimal solution depends on the project and the team.

Sometimes the path of least resistance--that is to say, the path to getting things shipped and, in turn, making money--is to let the codebase grow organically and worry about cleaning up any messy interdependencies later, once you have a better idea of what code you even need to keep around. In this scenario, it's important to recognize that developer efficiency is going to be an uphill battle in the long run, but if you are proactive about maintenance and tooling improvements then this approach can still be relatively painless.

Other times, especially when you're working on a tried-and-tested product with a clear API and a dedicated team, it can be productive to split it out and let the team manage their own versioning and releases. This becomes especially useful if the product is open source. (For instance, I wonder how Facebook manages its open source releases relative to its shared Mercurial codebase.) In this scenario, developer efficiency is usually less of a problem, as proper use of versioning can ensure faster, more agile updates to each product. But the downside is that your company as a whole can end up in a kind of versioning hell, where every project depends on a different version of every other project, and keeping everything up to date can require a huge amount of coordination.
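To make the "versioning hell" point concrete, here is a rough sketch of how you might spot version skew across separately released projects. The project names, library names, and pinned versions are all made up for illustration; a real check would read them from lockfiles or manifests.

```python
from collections import defaultdict

# Hypothetical pins for three internal projects (illustrative data only).
PINS = {
    "billing": {"auth-client": "1.4.0", "logging-lib": "2.1.0"},
    "search": {"auth-client": "2.0.1", "logging-lib": "2.1.0"},
    "mobile-api": {"auth-client": "1.9.3", "logging-lib": "3.0.0"},
}


def find_version_skew(pins):
    """Map each library pinned at more than one version to {version: [projects]}."""
    by_lib = defaultdict(lambda: defaultdict(list))
    for project, deps in pins.items():
        for lib, version in deps.items():
            by_lib[lib][version].append(project)
    # Keep only libraries where the company is running several versions at once.
    return {
        lib: dict(versions)
        for lib, versions in by_lib.items()
        if len(versions) > 1
    }


skew = find_version_skew(PINS)
for lib, versions in sorted(skew.items()):
    print(lib, {v: sorted(projects) for v, projects in sorted(versions.items())})
```

In a single shared codebase this report would be empty by construction, which is more or less the whole trade-off: you pay for that with tooling instead of coordination.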

So, in the end, pick your poison. My reaction, years ago, was more along the lines of #1, but I used to be much more of an idealist earlier in my career.

4 comments

Approach #1 lands you in dependency hell: you have to maintain multiple incompatible versions of each internal library or framework, and of each external library in use, which multiplies the work involved in upgrading dependencies and leads to its own problems in various other ways. There's no panacea, but I can see the appeal of having a single shared codebase.
Still, much easier to manage with branches than before git and mercurial were around.
It's not a question of managing source branches, it's a question of whether you have to update your OpenSSL dependency once for the whole company or a thousand times over for each individual software package.
I was super skeptical of it when I started at Google. But it just works, and along with the rigorous code review process and commitment to code health, it makes for clean code with lots of consistency and re-use.
I see the "sweeping change" argument put forward often, but I don't really understand how this can be the case.

There is no atomic deployment of a large distributed system - so even if you can check in related changes in different areas of a codebase, how do you release them?

Solving deployment is orthogonal to organizing your code. You can deploy multiple products that share the same codebase, and you can also deploy one product that draws from multiple codebases (say, from other gems or npm packages that you also maintain). Regardless of where your code lives, handling distributed releases generally requires agreed-upon contracts (e.g. schemas) and proper dependency management during rollouts.
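As a rough illustration of what an "agreed-upon contract" check could look like, here is a minimal sketch that flags schema changes likely to break a rollout where old and new versions run side by side. The schema representation (field name mapped to a `(type, required)` pair) and the example fields are assumptions for the sake of the example, not any particular schema system, and the rules are deliberately simplified.

```python
# Hypothetical schemas: field name -> (type name, required?)
OLD = {"id": ("int", True), "email": ("str", True)}
NEW_OK = {"id": ("int", True), "email": ("str", True), "nickname": ("str", False)}
NEW_BAD = {"id": ("str", True), "nickname": ("str", True)}


def breaking_changes(old, new):
    """List changes that would break peers still running the old schema.

    Simplified rules: existing fields may not be removed or retyped,
    and any newly added field must be optional.
    """
    problems = []
    for name, (ftype, _required) in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name][0] != ftype:
            problems.append(f"retyped field: {name} ({ftype} -> {new[name][0]})")
    for name, (_ftype, required) in new.items():
        if name not in old and required:
            problems.append(f"new required field: {name}")
    return problems


print(breaking_changes(OLD, NEW_OK))
print(breaking_changes(OLD, NEW_BAD))
```

The same check applies whether the two sides live in one repository or in separately versioned packages, which is the sense in which deployment is orthogonal to code organization.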
> For instance, I wonder how Facebook manages its open source releases relative to its shared Mercurial codebase.

It sanitizes and syncs them to GitHub:

https://code.facebook.com/posts/1715560542066337/automatical...