by Jyaif 3317 days ago
pros to big repo:

-don't have to spend time thinking about defining interfaces

cons:

-history is full of crap you don't care about

-tests take forever to run

-tooling breaks down completely, though thanks to MS the limit has been raised considerably


Are the big monorepo companies actually waiting for the global test suite to complete for every change? I doubt that; I'm sure they're using intelligent tools to figure out which tests to actually run. Compute for testing is massively expensive at that scale, so it's an obvious place to optimize.
Google's build and testing system is smart about which tests to run, as you suspect, but it still has a very, very large footprint.
Right. My point is that the monorepo almost certainly isn't a problem in this regard.
You still have to do something about internal interfaces. The problem is that the moment you want to make a backwards-incompatible change to an internal interface, you have to go find its users, and there go the benefits of GVFS... Or you can let the build and test system tell you what breaks (take a long coffee break, and repeat as many times as it takes, which could be many). Or use something like OpenGrok to find all those uses in last night's index of the source.

Defining what portions of the OS you'll have to look in for such changes helps a great deal.

As for building and testing... the system has to get much better at detecting which tests need to be re-run for any particular change. That's difficult, but you can get 95% of the way there easily enough.
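To make the "which tests need to be re-run" idea concrete, here is a minimal sketch, assuming you already have a dependency graph of build targets. The target names and the `affected_tests` helper are made up for illustration; this is not how any particular company's build system does it, just the basic reverse-dependency walk.

```python
from collections import defaultdict, deque


def affected_tests(deps, tests, changed):
    """Return the tests that transitively depend on any changed target.

    deps maps each target to the targets it depends on directly.
    """
    # Invert the graph: for each target, record who depends on it.
    rdeps = defaultdict(set)
    for target, needs in deps.items():
        for dep in needs:
            rdeps[dep].add(target)

    # Breadth-first walk from the changed targets along reverse edges.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependant in rdeps[current]:
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)

    return sorted(seen & set(tests))


# Hypothetical targets, for illustration only.
DEPS = {
    "net_test": ["net"],
    "ui_test": ["ui"],
    "gfx_test": ["gfx"],
    "ui": ["net", "gfx"],
    "net": [],
    "gfx": [],
}
TESTS = {"net_test", "ui_test", "gfx_test"}
```

With this graph, a change to `gfx` selects `gfx_test` and `ui_test` and skips `net_test`. The hard remaining 5% is everything the declared graph doesn't capture (generated files, runtime data, undeclared dependencies).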

-don't have to spend time thinking about defining interfaces

That seems like a design and policy choice, orthogonal to repos.

Not really. It's easier to make a single atomic breaking change to how different components talk to each other if they are in the same repository.

If they are in different repos, the change is not atomic and you need to version interfaces or keep backwards compatibility in some other way.
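A rough sketch of what "keep backwards compatibility in some other way" can look like when callers live in other repos: ship the new interface alongside a shim for the old one, and delete the shim once everyone has migrated. The function names (`fetch_user`, `fetch_user_v2`) are hypothetical and only there to show the shape.

```python
import warnings


def fetch_user_v2(user_id, *, include_profile=False):
    """New interface: returns a dict and takes a keyword-only flag."""
    user = {"id": user_id}
    if include_profile:
        user["profile"] = {}
    return user


def fetch_user(user_id):
    """Old interface, kept as a shim until callers in other repos migrate."""
    warnings.warn(
        "fetch_user is deprecated; use fetch_user_v2",
        DeprecationWarning,
        stacklevel=2,
    )
    # Preserve the old return shape (just the id) on top of the new code path.
    return fetch_user_v2(user_id)["id"]
```

In a single repo you could instead change the signature and every call site in one commit, which is the atomic change described above.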

It very much is. The fact that it's easier doesn't really matter - a repo is about access to the source code and its history with some degree of convenience. The process and policy of how you control actual change are quite orthogonal. You can have a single repo and enforce inter-module interfaces very strongly. You can have 20 repos and not enforce them at all. The same goes for builds, tests, history, etc. The underlying technology can influence the process, but it doesn't dictate it.
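As an illustration of enforcing inter-module interfaces inside one repo, here is a minimal sketch of an import-boundary check you might run in CI. The package names and the `ALLOWED` policy table are assumptions made up for the example; the only real API used is Python's standard `ast` module.

```python
import ast

# Hypothetical policy: which internal packages each package may import.
ALLOWED = {
    "ui": {"net_api"},
    "net": set(),
}


def boundary_violations(package, source):
    """Return internal packages that `source` imports without permission."""
    # Everything named in the policy counts as an internal package.
    internal = set(ALLOWED) | {d for deps in ALLOWED.values() for d in deps}

    imported = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x.y import z" imports; relative imports stay in-package.
            imported.add(node.module.split(".")[0])

    # A package may always import itself, plus whatever the policy grants.
    allowed = ALLOWED.get(package, set()) | {package}
    return sorted((imported & internal) - allowed)
```

For example, a `ui` file containing `import net.sockets` would be flagged with `["net"]`, while `from net_api import client` and `import os` pass. The same check works just as well across 20 repos, which is the point: the policy is separate from where the code lives.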
I have always wondered how they deal with acquisitions and sales. I guess a single system makes sense there too.