Hacker News
by jmillikin 1230 days ago

  > The single version dependencies are asinine. We are migrating to
  > a monorepo at work, and someone bumped the version of an open
  > source JS package that introduced a regression.
There's no requirement to have single versions of dependencies in a monorepo. Google allows[0] multiple versions of third-party dependencies such as jQuery or MySQL, and internal code is expected to specify which version it depends on.

  > It encourages poor API contracts because it lets anyone import any
  > code in any service arbitrarily.
Not true at Google, and I would argue that if you have a repository that allows arbitrary cross-module dependencies then it's not really a monorepo. It's just an extremely large single-project repo with poor structure. The defining feature of a monorepo is that it contains multiple unrelated projects. At Google, this principle was so important that Blaze/Bazel has built-in support for controlling cross-package dependencies.
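
To make "controlling cross-package dependencies" concrete, here is a minimal Python sketch of a visibility check. It is only loosely modeled on the idea of a per-target visibility list and is not how Blaze/Bazel implements it; the `Target` class, the `can_depend` helper, the `"public"` marker, and the package names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """A build target that lists which packages may depend on it."""
    package: str
    name: str
    # Packages allowed to depend on this target (hypothetical model).
    visibility: set = field(default_factory=set)


def can_depend(consumer_package: str, target: Target) -> bool:
    """Return True if code in consumer_package may depend on target."""
    # A target is always usable from within its own package.
    if consumer_package == target.package:
        return True
    # "public" is an assumed marker meaning any package may depend on it.
    return "public" in target.visibility or consumer_package in target.visibility


# Hypothetical targets: an internal helper and a published client library.
billing_internal = Target("billing", "ledger_impl", visibility={"billing/api"})
billing_client = Target("billing/api", "client", visibility={"public"})
```

Under these assumptions, a `search` package could depend on `billing_client` but not on `billing_internal`, which is the kind of boundary that keeps projects in one repository independent.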

  > I see at least one PR every week [...] because some shared config
  > or code changed in another part of the repo and now the entire repo
  > needs to be migrated in lockstep for things to compile.
That really doesn't sound like a monorepo to me. If all the code has to be migrated "in lockstep", then that implies a single PR might change code across different parts of the company. At which point it's not independent projects in a monorepo, it's (merely) a single giant project.

[0] Or allowed -- I last worked there in 2017.

7 comments

I never worked at Google, but this post sums up everything I had to say about the matter. GP has a sh-tty monorepo experience at one company and decides to make a statement about another company where they never worked (so I presume). HN absurdism at its best!

I second your point about monorepo versus ball of mud. They are so different. And managing all of this is about social/culture, less science-y. If you don't have good culture around maintenance, well then, yeah, duh, it will fall apart pretty quickly. It sounds like Google spends crazy money to develop tools to enforce the culture. Hats off.

It generally gives the sense that mono-repo is actually irrelevant, and the more detailed processes across the whole experience are what matters.
Yep, the top comment in this thread is a fantastic example of typical HN comments. Naïveté masked as expertise.
There's always been a very strong one version policy, multiple versions are usually only allowed to coexist for weeks or months, and are usually visibility restricted.

This prevents situations where "Gmail" ends up bundling 4 different, mildly incompatible versions of MySQL or whatever, and the aggravation that would cause. Or worse, in C++ you get ODR violations due to a function being used from two versions of the same library.

I think the catch is that it isn't just third-party dependencies that are of concern. In particular, at a certain size, you are best off treating every project in the company as a third party item. But, that is typically not what you are wanting with source dependencies.

You can see this some with how obnoxious Guava was, back in the day. It seems a sane strategy where you can deprecate things quickly by getting all callers to migrate. This is fantastic for the cases where it works. But, it is mind numbingly frustrating in the cases where it doesn't. Worse, it is the kind of work that burns out employees and causes them to not care about the product you are trying to make. "What did you do last month?" "I managed to roll out an upgrade that had no bearing on what we do."

There’s a policy against multiple versions of third party dependencies. Though there is a mechanism for exceptions.
Sounds sane, usually you don't want multiple versions just because people were too lazy to keep code up to date. But in some instances it's probably worth supporting several major versions across the whole org.
I guess the question then becomes: Is it worth all the extra tooling required to manage a monorepo properly?
There is a lot of extra tooling required to manage a large non-monorepo org too.
The answer of course is no, but since they have a stable search product that supplied unlimited money, they were able to stubbornly stick to that decision.
https://opensource.google/documentation/reference

The third party documentation is public, one-version policies exist but they are exemptions.

> There's no requirement to have single versions of dependencies in a monorepo. Google allows[0] multiple versions of third-party dependencies such as jQuery or MySQL, and internal code is expected to specify which version it depends on.

Sure, but this is unsustainable. If service Foo depends on myjslib v3.0.0, but service Bar needs to pull in myjslib v3.1.0, in order to make sure Foo is entirely unchanged, you'd have to add a new dependency @myjslib_v3_1_0 used only by Bar. After two years you'd have 10 unique dependencies for 10 versions of myjslib in the monorepo.

At this point you've basically replicated the dependency semantics of a multi-repo world to a monorepo, with extra cruft. This problem is already implicitly solved in a multi-repo world because each service simply declares its own dependencies.

  > Sure, but this is unsustainable. [...] After two years you'd have
  > 10 unique dependencies for 10 versions of myjslib in the monorepo.
This is a social problem, and needs to be solved by a dependency management policy. Your org might decide that the entire org is only allowed to use a single version of each third-party dependency (which IMO is harsh and unhelpful), or might have a deprecation period for older versions, or might have a team dedicated to upgrading third-party deps.

Note that this need for a policy exists for both mono-repo and multi-repo worlds. Handling of third-party dependencies ought to be independent of how the version control repository is structured.

  > At this point you've basically replicated the dependency semantics of
  > a multi-repo world to a monorepo, with extra cruft. This problem is
  > already implicitly solved in a multi-repo world because each service
  > simply declares its own dependencies.
The problem with the multi-repo solution is that there's no linear view of the changes. Each repo has its own independent commit graph, and questions like "does the currently deployed version of service X include dependency commit Y" become difficult or impossible to answer.

That's why monorepos exist. They're not a way to force people to upgrade dependencies, and they aren't a get-out-of-jail-free card for thinking about inter-project dependencies. A monorepo lets you have a linear view of code history.
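
As a rough illustration of why a linear history makes that question easy, here is a minimal Python sketch. It assumes the trunk is a simple ordered list of commit IDs and ignores reverts and cherry-picks; the `includes_commit` helper and the commit names are hypothetical.

```python
def includes_commit(history: list[str], deployed_at: str, candidate: str) -> bool:
    """Return True if `candidate` is at or before `deployed_at` in a linear history."""
    # Map each commit ID to its position on the trunk (oldest first).
    position = {commit: index for index, commit in enumerate(history)}
    return position[candidate] <= position[deployed_at]


# Hypothetical trunk, oldest commit first.
trunk = ["c1", "c2", "c3", "c4", "c5"]
```

With a single ordering, "does the service deployed at `c4` include dependency commit `c2`" is one comparison. With independent commit graphs per repository there is no shared ordering to compare against.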

Phrased differently: many people approach monorepos as a way to force their view of dependency management on other people in their organization. The successful users of monorepos (including Google) take great efforts to let separate projects in the same repo operate independently.

> Sure, but this is unsustainable. If service Foo depends on myjslib v3.0.0, but service Bar needs to pull in myjslib v3.1.0, in order to make sure Foo is entirely unchanged, you'd have to add a new dependency @myjslib_v3_1_0 used only by Bar. After two years you'd have 10 unique dependencies for 10 versions of myjslib in the monorepo.

you're imagining a situation and speculating up a problem that might occur in that imaginary situation. in reality, no one does the thing you said - you don't add random deps on random external javascript libraries that don't have sane versioning stories.

> Sure, but this is unsustainable.

Not exactly unsustainable considering Google has been very successful with this approach!

I suspect Google spends more on developer tooling than any organization on the planet. Probably worth considering that whenever trying to see whether something would work for you.
Is this a good comparison? Not everyone is Google-size. In fact, very few businesses are. What is manageable for Google, or a good practice for Google, might be unsustainable for another business.
I think the lesson to draw from bigorgs isn't what to do as a smallorg, but what directions you can grow and what the pitfalls are on those roads.

Any smallorg probably wants a bare monorepo, git or what have you. If you grow to the point that becomes unwieldy, you can either invest in tooling the way Google has, or be prepared to split the repo into library and project repos in a way that makes sense for what your mediumorg has grown into.

What is a "bare monorepo" in contrast to simply a "monorepo"?
Google is a monorepo with sixteen layers of tooling to make it searchable, to not require you to spend 3 days making a local copy before editing, to manage permissions across orgs, etc etc etc.

A small organization of 1-20 people should not emulate the layers of tooling; just have a single git repo somewhere and call it a day.

Google isn't successful because of their tech decisions. They just happen to make an infinite amount of ad money; everything else they do is mainly getting their engineers to play in sandboxes to distract them so they won't leave to start other ad markets. It works though, since everyone is in love with their complex makework ideas.
I want to think they have. But... this is also why they kill older products. The cost of keeping the lights on is greatly elevated when keeping the lights on means keeping up with the latest codes.

This is absolutely no different from buildings. If you had to keep every building up to date with the latest building codes, you would tear them down way way way more often.

> this is also why they kill older products. The cost of keeping the lights on is greatly elevated when keeping the lights on means keeping up with the latest codes.

This is a really good point and I think accurate when it comes to smaller Google endeavors. I don't think this killed Stadia, for example, but maybe Google Trips (an amazing service that I don't think many folks used and likely had few development resources assigned, or none).

Yeah, I would not mean this to include Stadia. That said, if it adds costs to the smaller Trips and such, it has to add cost to the larger things, too. That is, if it makes the cheap things expensive, it probably makes the expensive things even more so.
So why is this a problem for a monorepo but not the multi-repo? It seems to me that the major difference is that in a multi-repo, you'd be more likely to be oblivious to the multitude of dependency issues than you are in the monorepo... and to be honest, that actually sounds like a bad thing to me, because it means you're sweeping legitimate issues underneath the rug.
In a multi-repo, you don't build source dependencies between projects.

You can do this with a mono, as well. However, the conceit is that "in the same repo" means you can "change them together." It is very very tempting that "went out as a single commit" means that it went out fine. Which, just isn't something you see in a multi world.

  > However, the conceit is that "in the same repo" means you can
  > "change them together."
In a monorepo you shouldn't be making changes to independent components in a single commit. That's how you end up being forced to roll back your change because you broke someone else's service.

If you're making a backwards-incompatible change to an API then you need to:

1. Make a commit to your library to add the new functionality,

2. Send separate commits for review by other teams to update their projects' code,

3. Wait for them to be approved and merged in, then merge a final cleanup commit.
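
The three steps above can be sketched in Python roughly as follows. This is only an illustration under assumed names: `fetch_user`, `fetch_user_v2`, and `render_profile` are hypothetical, and a real migration would spread these pieces across separate commits and reviews.

```python
import warnings


# Step 1: the library adds the new function and keeps the old one working.
def fetch_user_v2(user_id: int, *, include_inactive: bool = False) -> dict:
    """New API (hypothetical): the flag becomes keyword-only."""
    return {"id": user_id, "include_inactive": include_inactive}


def fetch_user(user_id: int, include_inactive: bool = False) -> dict:
    """Old API, kept as a thin wrapper until every caller has migrated."""
    warnings.warn(
        "fetch_user is deprecated; use fetch_user_v2",
        DeprecationWarning,
        stacklevel=2,
    )
    return fetch_user_v2(user_id, include_inactive=include_inactive)


# Step 2: each downstream team updates its own call sites in its own commit.
def render_profile(user_id: int) -> str:
    user = fetch_user_v2(user_id, include_inactive=True)
    return f"user {user['id']}"


# Step 3 (final cleanup commit): delete fetch_user once no callers remain.
```

Because the old and new functions coexist between steps 1 and 3, no single commit has to touch more than one team's code.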

If your repository is designed to enable a single commit to touch multiple independent projects then it's not a monorepo, it's just a single-project repo with unclear API and ownership boundaries.

This is clearly correct. But even in a multi world, I've seen far more attempts at atomic commits than make sense.

I'd love for it to be a strawman. But I do keep finding them.

You do get me to question what a mono repo is. I've never seen one that wasn't essentially an attempt at treating a company as a large project. Akin to a modular codebase with a single build. Could be a complicated build, mind you. Still, the goal has always been a full repository build.

> In a multi-repo, you don't build source dependencies between projects.

In my experience with software projects, this is very much not the case. It's one of the main reasons I'm such a big fan of monorepos--I have been burned way too many times by the need to make atomic commits involving separate repositories.

If you have multiple repos, you can't have an atomic commit between them. Pretty much period. I'm scared to hear what you mean on that.

Ideally, all tooling makes the separate nature of the projects transparent. They should test separately. They should deploy separately. If that is not the case, then yes, they should be in the same repo.

I've worked at places where they would "solve" this problem by letting the build break while all the commits in various repositories land all at once. It's really bad.