by psv1 2470 days ago
Not a TypeScript user, but what really stood out to me is that Google are using a monorepo.
6 comments

Be careful not to confuse Google's use of a monorepo with the question of whether $DAYJOB or $FOSSPROJECT should use one.

Google has a lot of tooling, plus some very thoroughly considered and reinforced policies and culture, around its use of a monorepo. Using a monorepo without those tools and ingrained policies is not likely to lead to similar results.

The default should be keeping code together, and justifying why you are splitting it up, not the other way around.

While what you said is true, the codebase of most organisations is not large enough to run into those scale constraints for a long time. If you split up your code, on the other hand, you immediately get the organisational headache of managing changes across multiple codebases. If you keep it together, the scale challenges can be managed later, if they occur (cf. YAGNI).

If you are splitting, codebases should be split across organisational boundaries, not technical ones. If you look at the FOSS world, the obvious conclusion is that each library is in its own repo, and that is partly true. In practice what is happening is that each OSS project is its own organisational team, and so it lives together. If you have parts of your company on different continents working on different things (and maybe you need different access) then splitting up may make sense. Otherwise, you're probably just making your life harder for little to no gain.

I think I'd more or less agree with that.

For example, one of the questions I generally ask in a meeting that's considering this topic, and that has, say, microservices in flight, is: "Are you willing to put in the work to make Service A support more than one version of Service B at a time?" If not, that's a very (very) strong indicator that the level of coupling and the lack of organizational boundaries will result in pain from anything but a monorepo.

I prefer to split things up when possible. When it does fit, there are many benefits. There's also the concern that building too much culture and tooling that presumes repos are never split can become its own form of trap, one that gets increasingly hard to navigate out of, even if you later want to, as the situation reinforces itself. But I agree that splitting things out needs justification, and the "default" stance should include that.

Another interesting bit with the monorepo is the one-version-policy:

https://opensource.google.com/docs/thirdparty/oneversion/

That's why the TypeScript upgrade was so hard for them. We attempt to enforce that only a single version of a library/toolchain is checked into the codebase at any given time. You can have multiple versions checked in during an upgrade, but it's highly discouraged.
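
To make the one-version idea concrete, here is a minimal Python sketch of the kind of check it implies. This is not Google's tooling; the `find_version_conflicts` helper and the project names and version numbers are all made up for illustration.

```python
from collections import defaultdict


def find_version_conflicts(pins):
    """Return {library: sorted versions} for libraries pinned at more than one version.

    `pins` is an iterable of (project, library, version) tuples.
    """
    versions = defaultdict(set)
    for _project, library, version in pins:
        versions[library].add(version)
    # Keep only libraries that violate the one-version rule.
    return {lib: sorted(vs) for lib, vs in versions.items() if len(vs) > 1}


# Hypothetical pins; in a real repo these would come from build metadata.
PINS = [
    ("web/app", "typescript", "3.5"),
    ("web/admin", "typescript", "3.5"),
    ("tools/cli", "typescript", "3.4"),
    ("web/app", "lodash", "4.17"),
]

print(find_version_conflicts(PINS))  # {'typescript': ['3.4', '3.5']}
```

In a monorepo with a single third-party directory, a check like this is mostly enforced by layout: there is only one place for the library to live.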

This is also why Google says to test everything; even minor version upgrades can have unexpected behavioral changes. Without tests, these might break your project without warning.
On the contrary, having multiple versions could have made the upgrade much worse, by deferring compatibility problems from submit time to deploy time.
It's a trade-off. As a user of a third-party library, I would like to upgrade it to get new functionality. But there are breaking changes in the update.

So to upgrade, I would have to fix every user of the library. While this is better overall for the codebase, it can push a lot of work onto others when the third-party library is not well maintained. Something like TS has people who help keep it updated. But for something more obscure, it'll be on whoever cares enough to put in the work.

Google is pretty notorious for this. It is one of the reasons behind the old golang GOPATH setup, and one of the reasons it took so long for Go to get modules.
I'm not sure that's true. The layout of Go code in Google's monorepo is not at all similar to GOPATH, and patterns that are common within google3 (such as multiple Go packages in one directory) are fundamentally incompatible with the Go build system. As an ex-Googler, I still strongly prefer the google3 style and am annoyed when open-source Go tooling can't deal with it properly.
What is google3 style?
I think it's the mentioned

> multiple Go packages in one directory

as opposed to having a 1:1 package:dir mapping

How does that work on a nodejs project? I understand that they only have one version of the TypeScript compiler for all projects inside the monorepo, so that means there's only one huge package.json inside with all the packages used by every project inside the repo?
The lingua franca for Google's build needs is (more or less) Bazel, where you say that target /a/b/c depends on /dep/v1_1, /dep/xyz, /common/foo, etc. (There is a filesystem-like hierarchy parallel to, but not necessarily the same as, the corresponding repository's directory layout.)
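
As a rough illustration of "target X depends on Y", here is a plain-Python model of such a dependency graph. It is not Bazel syntax or Bazel's API; the `DEPS` table reuses the labels from the comment above, and the extra edge from /dep/xyz to /common/foo is an assumption added to show transitivity.

```python
# Hypothetical target graph: each target lists its direct dependencies.
DEPS = {
    "/a/b/c": ["/dep/v1_1", "/dep/xyz", "/common/foo"],
    "/dep/xyz": ["/common/foo"],  # assumed edge, for illustration only
    "/dep/v1_1": [],
    "/common/foo": [],
}


def transitive_deps(target, deps):
    """Return every target reachable from `target` (assumes an acyclic graph)."""
    seen = set()
    stack = list(deps.get(target, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(deps.get(node, []))
    return seen


print(sorted(transitive_deps("/a/b/c", DEPS)))
# ['/common/foo', '/dep/v1_1', '/dep/xyz']
```

The point is that dependencies are declared per target rather than in one repo-wide manifest, so the build tool can work out exactly what each target needs.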

Bazel is extensible via rules [1], so if you really wanted to use NodeJS on your team, you might create a `nodejs_binary` rule that put everything in the right directory and ran some NodeJS packager on it. You'd probably not put it into production.

Also, third-party code lives in a single third-party directory, so yes, internal users could pull down code they wanted (and for which there wasn't a satisfactory internal version already) into that directory: https://opensource.google.com/docs/thirdparty/

[1]: https://docs.bazel.build/versions/0.29.0/skylark/rules.html

That stood out to me too... not the monorepo really, but the fact that it's a monorepo of a billion lines of code. It seems almost impossible to maintain when a dependency change at the lowest levels affects numerous projects. If I am understanding correctly, Google was in a situation where they had to update every project that used TypeScript inside the entire company at the same time, and that seems untenable.
Whether it's a monorepo or not doesn't change the fact that you have a billion lines of code to maintain, but at least this way they're forced to make changes consistently. The alternative would be doing it piecemeal with a dozen versions of dependencies propagating due to fragmentation, which is way way worse.
Here's an article from a maintainer of a framework library (Angular): https://medium.com/@Jakeherringbone/you-too-can-love-the-mon...

Something like this might only be possible due to their tooling and test coverage. When you change something, you are immediately alerted to any tests it breaks.
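
A sketch of how tooling could work out which tests to run after a change, assuming a dependency table like the one a build system maintains. This is a toy model, not Google's actual presubmit system; the target names and the `affected_targets` helper are hypothetical.

```python
from collections import defaultdict, deque


def affected_targets(changed, deps):
    """Return every target that depends, directly or transitively, on `changed`."""
    # Invert the graph: for each dependency, record who depends on it.
    reverse = defaultdict(set)
    for target, direct in deps.items():
        for dep in direct:
            reverse[dep].add(target)

    affected = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in reverse[node]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical targets: a low-level library, a library on top of it, and tests.
DEPS = {
    "//lib/base": [],
    "//lib/ui": ["//lib/base"],
    "//app/mail:tests": ["//lib/ui"],
    "//app/docs:tests": ["//lib/base"],
    "//app/standalone:tests": [],
}

print(sorted(affected_targets("//lib/base", DEPS)))
# ['//app/docs:tests', '//app/mail:tests', '//lib/ui']
```

A change to the low-level library selects every test above it, while unrelated tests are skipped. That is why a low-level change is expensive but also why breakage shows up right away.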

Mostly it means that you have to be really thoughtful when introducing a breaking change to a low level library.
Or you can YOLO / Leeroy Jenkins the change and let everyone else fix the breakage you create, or see if they demand a rollback.
They will not only demand a rollback but also do it. Besides, low level changes at this scale require a copious amount of approvals.
Google is using the monorepo ;-)