by sigil 2722 days ago
Thanks for the response. Out of curiosity, how does your engineering organization introduce new dependencies within the monorepo? Can B, C and D all depend on A without A's consent or even awareness? (Suppose A is some checked in code that's useful, going to see updates in future, but is dormant at present.)

Your post puts a lot of the onus on A for breaking B, C, and D, but I think equal care and consideration needs to come from the other side of the contract. Eg, What are you depending on? Is it a dependency you want to take on, or are you and the shared code likely to diverge in life? These are top of mind decisions in a polyrepo architecture, but from my experience they're often not even considered in a monorepo. Anything checked in is fair game for reuse. This is why I suspect you may be "forcing" the wrong thing.

For reference I've worked in companies large and small, both monorepo and polyrepo. When I worked on Windows back in the 00's the monorepo tooling (SourceDepot) was quite amazing for the time, but the costs of that sort of coordination were also painfully apparent to everyone.

The place I currently work has a monorepo for desktop software and polyrepos for everything else. It isn't a straight up A/B experiment, but anecdotally the pain is higher and shipping velocity lower in the monorepo half of the world. Most of the monorepo pain is related to CI or other costs of global coordination, the kind of things Matt touches on midway (albeit probably too subtly). I'd be interested to see your counterarguments to those points as well. Do you need fancy dependency management tooling to make your global CI builds fast and reproducible? Matt argues those end up being equivalent to the kind of dependency tooling that's intrinsic to polyrepo architectures anyway.


Disclaimer: it depends. :) Since that's not a good answer at all, I'm going to write the rest of this as if I have the answer, even though I know I do not, because it's deeply situational.

Equal care does need to come from the other side of the contract. Most frequently, I see teams B, C, and D in a polyrepo world do the worst of all worlds: take dependencies liberally, pin them in place, and try to forget about them. Of course, high functioning engineering teams (and cultures) will try to avoid this: they will be thoughtful about dependencies, and they will keep them up to date. In practice, they most frequently do not. This is especially true in the enterprise broadly. When we get it wrong, and take a dependency we wish we hadn't, how do we know? When do we know? What is our recourse? If I depend on code in the monorepo that diverges, I'm more likely to know near to the point of divergence (because of the nature of the system). That means the conversation about how to fix it happens sooner. I'm not interested in avoiding error - that's going to happen. I'm interested in how soon after the error is introduced we understand it, and how we communicate about its remediation.

As far as CI and global coordination goes, the cost is high in either direction if the system is distributed, and the solutions are similar in my experience. I think the worst case is the mixed one (which is a world I inhabit) - you wind up splitting your investment in both style and effort across both approaches. With the monorepo style, one big advantage is where the complex CI interactions can be encoded, since you have access to more of the code itself. Granted, at scale, you likely are testing against artifacts rather than point in time commits outside of the component in question (this is very similar to what you're going to do in a polyrepo, too.)

I think solid testing design requires real effort and understanding of the system under test, regardless of repository layout. Which brings us back to communications again. The more you can see, and the more clearly the pain is experienced across the teams, the more likely you are to have the critical conversations needed to improve the system - rather than making local fixes ("my team's tests are fast", "their component sucks").

Most frequently, I see teams B, C, and D in a polyrepo world do the worst of all worlds: take dependencies liberally, pin them in place, and try to forget about them.

This has been my observation as well, minus the value judgment. Why is pinning dependencies and moving on with life the worst thing in the world? As you point out in your article, a security fix in A does suddenly force B, C, and D’s hand. Another scenario I’ll add to that: if A provides communication between B, C and D, a synchronized update to all dependents might be required.

Thing is, I’d argue these scenarios are the exception to the rule. If you’re drawing boundaries in the right places (again this may come back to contract design) you’re largely free to change implementation details when you need to, on your own terms, and not because some distant transitive dependency has decided it’s time for your build to break.

With monorepos I see lots of the latter. Lots of breakage for no other reason than “everyone needs to be on the same page.” Lots of conversations — O(N^2) conversations, times some constant factor — that might not need to take place, ever, but it’s critical the entire company have them right now because the global build is broken.

Here’s another way of looking at it. Until a few years ago, it was standard practice to frequently update npm dependencies against fuzzy semver ranges. Now most people pin their dependencies, and their dependencies’ dependencies, with a lockfile. And in other ecosystems like Go’s you also have tooling to support much more controlled, infrequent and minimal dependency upgrades (see MVS, minimal version selection).
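
For readers who haven't run into MVS, here is a rough sketch of the core idea: walk every requirement reachable from the root module and, for each module, keep the highest version that anything actually asks for, rather than the newest version that exists. The function name, module names, and version data below are made up for illustration, and this is a simplified model, not Go's actual implementation.

```python
from typing import Dict, List, Set, Tuple

Version = Tuple[int, int, int]
Requirement = Tuple[str, Version]  # (module name, minimum version required)


def minimal_version_selection(
    root_requirements: List[Requirement],
    requirements_of: Dict[Requirement, List[Requirement]],
) -> Dict[str, Version]:
    """Return one version per module: the highest *requested* minimum.

    `requirements_of` maps a (module, version) pair to the requirements
    declared by that exact version. Both arguments are hypothetical inputs
    standing in for what a real tool would read from manifest files.
    """
    selected: Dict[str, Version] = {}
    visited: Set[Requirement] = set()
    pending = list(root_requirements)

    while pending:
        requirement = pending.pop()
        if requirement in visited:
            continue
        visited.add(requirement)

        module, version = requirement
        # Keep the highest minimum anyone asked for, never anything newer.
        if module not in selected or version > selected[module]:
            selected[module] = version

        # Follow the requirements declared by this specific version.
        pending.extend(requirements_of.get(requirement, []))

    return selected


# Hypothetical data: the root wants A >= 1.2.0 and B >= 1.1.0,
# while B 1.1.0 itself wants A >= 1.3.0.
root = [("A", (1, 2, 0)), ("B", (1, 1, 0))]
graph = {
    ("A", (1, 2, 0)): [],
    ("A", (1, 3, 0)): [],
    ("B", (1, 1, 0)): [("A", (1, 3, 0))],
}
build_list = minimal_version_selection(root, graph)
```

In this toy graph A resolves to 1.3.0 even if a newer A has been published, because nothing in the graph asks for it. That is the "minimal" part: an upgrade only happens when someone edits a requirement.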

Why the change? Because people got tired of things breaking all the time. They wanted off the treadmill so they could Get Things Done again. I don’t see how monorepos provide this stability, and frankly it seems like the monorepo idea is where npm was about 5 years ago. Perhaps even farther behind than that, since C, C++ and others haven’t even evolved viable language package managers yet.

You’re a Rust fan, so maybe Cargo + a monorepo is a sweet spot I haven’t encountered yet? Anyway, I do really appreciate you taking the time to share your perspective on these things. It’s been great having a reasonable discussion about them.

If you've got a pre-merge build check, you can't break the global build in a monorepo. That's the benefit: whoever introduces the breakage gets the failure in CI before the change lands. There is no need for other teams to catch up.

By doing this you only ever "step" a dependency one at a time and one minor version at a time, so you only get very few and very small breakages each time. Compare that with locking your depfile and then realizing 6 months down the road that you need a security fix in component foo, but now you have 1000 other backwards-incompatible changes to fix because transitive dependencies also need to be upgraded in order to satisfy foo 1.2's dependencies.
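
A minimal sketch of what such a pre-merge check has to compute, assuming the monorepo exposes a map from each target to the targets it depends on: everything that transitively depends on the changed code gets rebuilt and tested before the merge is allowed. The function name and the target names (A through E) are hypothetical, and real build tools do this with far more machinery.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set


def affected_targets(
    changed: Iterable[str],
    depends_on: Dict[str, List[str]],
) -> Set[str]:
    """Return the changed targets plus everything that transitively depends on them."""
    # Invert the graph: for each target, who depends on it directly?
    dependents: Dict[str, Set[str]] = defaultdict(set)
    for target, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(target)

    # Breadth-first walk outward from the changed targets.
    affected: Set[str] = set(changed)
    queue = deque(affected)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical layout: B and C use A directly, D uses B, E is unrelated.
deps = {"A": [], "B": ["A"], "C": ["A"], "D": ["B"], "E": []}
to_test = affected_targets(["A"], deps)
```

With this layout a change to A puts A, B, C and D in the pre-merge test set and leaves E alone, so the author of the change to A sees the failures in B, C or D before anyone else does.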

I think we agree (and it's probably self evident) that it's hard but necessary work to try and get the boundaries right, and it requires a lot of refactoring before things stabilize. In the early stages of work, that refactoring is frequent and often deep. Later, it (usually) becomes infrequent and shallow.

I think it's important to separate internal dependencies from external ones. My personal advice is to treat external dependencies in whatever way the language prefers, and upgrade on a cadence. This is because you can't have any real impact on your external dependencies - even if they are critical, you can essentially treat them as a black box for the purposes of this conversation. For the rest of my response, let's assume we're talking internal dependencies.

The thing about breakage "for no reason" is that you are still broken, you just don't know it yet. One assumes the team that broke you had a reason. It might be a good or bad reason, from your point of view, but it wasn't no reason. When I talk about forcing the conversation, this is why. It's not better to hide from the changes, or pretend that you are safe. You aren't. All that happens is you lengthen the time between when the breakage was introduced and when you discover it. Most frequently, that discovery happens when the upgrade becomes critical (security) - and by then the change takes longer to apply, and the team who made the breaking changes no longer clearly remembers the drift. This makes teams even less likely to move.

By ensuring these types of changes hurt, and are understood to be a shared responsibility (the consumer has a responsibility to move, the producer has a responsibility to understand and protect the stability of their consumers), teams have the impetus to design and build systems that ensure their stability. It's one thing to ask for things like circuit breakers, backwards compatible interfaces, etc. It's all theoretical from a single engineer's, or single team's, point of view. It's not a panacea, but when the contract is structured this way, everyone adapts to the issue: producers get more defensive, consumers get less debt.

Like I say in the original, I think this comes down to perspective. When my concern was primarily the efficiency of a single team that was small enough to stay connected through conversation and shared understanding, it mattered way less.

A lot of your reply comes from the perspective of wanting, as an engineer, to just Get Things Done again. I get it, and I'm sympathetic. It is harder to work this way, because you can't take the easy shortcuts (pinning, delaying the upgrade, ignoring your consumers, etc.) - but that's precisely the point. Those things are bad in the long term.

Actually Windows wasn’t a monorepo back then: there were separate repos for the shell, kernel, filesystem, etc. Hence the need for cross-repo tooling like “sdx”.

Source Depot was great (modulo availability issues), but I don’t think they got anywhere near the scale of Piper.

Actually Windows wasn’t a monorepo back then: there were separate repos for the shell, kernel, filesystem, etc. Hence the need for cross-repo tooling like “sdx”.

This is a bit misleading to outsiders. Each of these repos was huge for the time, corresponded to a major subsystem with many disparate components, and the default tooling on the ground was the cross-repo tooling. One got the impression that if they could have pulled off one giant monorepo to rule them all, they would have, but they fell just short due to some technical details (cough spinning magnetic disks). In the meantime `sdx` was a convenient abstraction that allowed people to work in a monorepo way.

All in all it wasn't so different from present-day monorepos broken into git submodules for performance reasons.