|
|
|
|
|
by sigil
2722 days ago
|
|
Thanks for the response. Out of curiosity, how does your engineering organization introduce new dependencies within the monorepo? Can B, C and D all depend on A without A's consent or even awareness? (Suppose A is some checked in code that's useful, going to see updates in future, but is dormant at present.) Your post puts a lot of the onus on A for breaking B, C, and D, but I think equal care and consideration needs to come from the other side of the contract. Eg, What are you depending on? Is it a dependency you want to take on, or are you and the shared code likely to diverge in life? These are top of mind decisions in a polyrepo architecture, but from my experience they're often not even considered in a monorepo. Anything checked in is fair game for reuse. This is why I suspect you may be "forcing" the wrong thing. For reference I've worked in companies large and small, both monorepo and polyrepo. When I worked on Windows back in the 00's the monorepo tooling (SourceDepot) was quite amazing for the time, but the costs of that sort of coordination were also painfully apparent to everyone. The place I currently work has a monorepo for desktop software and polyrepos for everything else. It isn't a straight up A/B experiment, but anecdotally the pain is higher and shipping velocity lower in the monorepo half of the world. Most of the monorepo pain is related to CI or other costs of global coordination, the kind of things Matt touches on midway (albeit probably too subtlely). I'd be interested to see your counterarguments to those points as well. Do you need fancy dependency management tooling to make your global CI builds fast and reproducible? Matt argues those end up being equivalent to the kind of dependency tooling that's intrinsic to polyrepo architectures anyway. |
|
Equal care does need to come from the other side of the contract. Most frequently, I see teams B, C, and D in a polyrepo world do the worst of all worlds: take dependencies liberally, pin them in place, and try to forget about them. Of course, high functioning engineering teams (and cultures) will try and avoid this: they will be thoughtful about dependencies, and they will keep them up to date. In practice, they most frequently do not. This is especially true in the enterprise broadly. When we get it wrong, and take a dependency we wish we hadn't, how do we know? When do we know? What is our recourse? If I depend on code in the monorepo that diverges, I'm more likely to know near to the point of divergence (because of the nature of the system). That means the conversation about how to fix it happens sooner. I'm not interested in avoiding error - that's going to happen. I'm interested in how close to the introduction of the error do we understand it, and how do we communicate about its remediation.
As far as CI and global coordination goes, the cost is high in either direction if the system is distributed, and the solutions are similar in my experience. I think the worst case is the mixed one (which is a world I inhabit) - you wind up splitting your investment in both style and effort across both approaches. With the monorepo style, one big advantage is where the complex CI interactions can be encoded, since you have access to more of the code itself. Granted, at scale, you likely are testing against artifacts rather than point in time commits outside of the component in question (this is very similar to what you're going to do in a polyrepo, too.)
I think solid testing design requires real effort and understanding of the system under test, regardless of repository layout. Which brings us back to communications again. The more you can see, and the more clearly experienced the pain is across the teams, the more likely you are to have the critical conversations needed to improve the system - rather than making local fixes ("my teams tests are fast", "their component sucks").