Code line count tends to grow exponentially. The bigger the code base, the less reasonable it is to expect people not to reinvent an existing wheel, whether out of ignorance of the code or out of fear of breaking what exists by altering it to handle their use case (ignorance of how the code is used).
IME it takes less time to go from 100 modules to 200 than it takes to go from 50 to 100.
Can’t we split the packages into logical groups and maybe have 20 or 30 monorepos of 70-100 packages? I doubt that all the devs involved in that monorepo have to deal with all the 2500 packages. And I doubt that there is a circular dependency that requires all of these packages to be managed in a single monorepo.
People act like managing lots of git repos is hard, then run into monorepo problems requiring them to fix esoteric bugs in C that have been in git for a decade, all while still arguing monorepos are easy and great and managing multiple repos is complicated and hard.
It's like hammering a nail through your hand, and then buying a different hammer with a softer handle to make it hurt less.
> all while still arguing monorepos are easy and great
I don't know anyone who says monorepos are easy.
To the contrary, the tooling is precisely the hard part.
But the point is that the difficulty of the tooling is a lot less than the difficulty of managing compatibility conflicts between tons of separate repos.
Each esoteric bug in C only needs to be fixed once. Whereas your version compatibility conflict this week is going to be followed by another one next week.
And the tooling to handle this is not even particularly conceptually complicated - a "versionset" is a set of versions - a set of pointers to a particular commit of a repository. When you build and deploy an application, what you're building is a versionset containing the correct versions of all its dependencies. And pull requests can span across multiple repositories.
Working at Amazon had its annoyances, but dependency management across repos was not one of them.
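To make the "versionset" idea above concrete, here is a minimal sketch in Python. It is only an illustration of "a set of pointers to a particular commit of a repository"; the names `make_versionset` and `apply_change`, and the repo names and commit hashes, are made up for the example and are not any real tool's API.

```python
def make_versionset(pins):
    """Build a versionset: each repo name maps to exactly one pinned commit.

    `pins` is an iterable of (repo, commit) pairs. Pinning the same repo
    to two different commits is the kind of conflict a versionset rules out.
    """
    versionset = {}
    for repo, commit in pins:
        existing = versionset.get(repo)
        if existing is not None and existing != commit:
            raise ValueError(f"{repo} pinned to both {existing} and {commit}")
        versionset[repo] = commit
    return versionset


def apply_change(versionset, updates):
    """Return a new versionset with several repos moved together.

    This models a change that spans multiple repositories: either all of
    the new pins are adopted, or the old versionset is kept as-is.
    """
    new_versionset = dict(versionset)  # copy, so the old set stays usable for rollback
    new_versionset.update(updates)
    return new_versionset


# Hypothetical repos and commit hashes, for illustration only.
base = make_versionset([("auth-lib", "a1b2c3"), ("http-client", "d4e5f6")])
candidate = apply_change(base, {"auth-lib": "0f9e8d", "http-client": "7c6b5a"})
```

In this sketch a rollback is just going back to building from `base`, since the old versionset is never mutated.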
> And pull requests can span across multiple repositories
This bit is doing a lot of work here.
How do you make commits atomic? Is there a central commit queue? Do you run the tests of every dependent repo? How do you track cross-repo dependencies to do that? Is there a central database? How do you manage rollbacks?
That's exactly the problem. At least tooling can solve monorepo problems.
But commits that should span multiple repos have no tooling at all. Except pain. Lots of pain.
Changing 100 CI pipelines is a giant pain in the ass. The third time, I split the work with two other people. The fourth time, someone wrote a tool and switched to a config file in the repo. 2500 is nuts. How do you even track red builds?
When you have hundreds of developers you’re going to get millions of lines of code. That’s partly Parkinson’s Law, but also we have not fully perfected the three-way merge, which encourages devs to spread out more than intrinsically necessary in order to avoid tripping over each other.
If you really dig down into why we code the way we do, the “best practices” in software development, about half of them are heavily influenced by merge conflicts, if merge conflicts aren’t the primary cause outright.
If I group like functions together in a large file, then I (probably) won’t conflict with another person doing an unrelated ticket that touches the same file. But if we both add new functions at the bottom of the file, we’ll conflict. As long as one of us does the right thing everything is fine.
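A toy model of that last point, in Python. This is not how git's merge actually works internally; it assumes each developer's change is a single insertion into the same base file, and treats two insertions at the same position as a conflict. The names `MergeConflict` and `merge_insertions` are invented for the sketch.

```python
class MergeConflict(Exception):
    """Raised when both sides insert at the same position in the base file."""


def merge_insertions(base, ours, theirs):
    """Merge two single-insertion edits made against the same base.

    Each edit is (index, new_lines), meaning "insert new_lines before
    base[index]". Two insertions at different positions merge cleanly;
    two insertions at the same position conflict, because the merge has
    no way to know which should come first.
    """
    if ours[0] == theirs[0]:
        raise MergeConflict(f"both sides inserted at line {ours[0]}")
    merged = list(base)
    # Apply the higher index first so the lower index is still valid.
    for index, new_lines in sorted([ours, theirs], key=lambda e: e[0], reverse=True):
        merged[index:index] = new_lines
    return merged


base_file = ["def alpha(): ...", "def mid(): ...", "def zeta(): ..."]

# One of us groups the new function with related code; the other appends.
clean = merge_insertions(
    base_file,
    (1, ["def beta(): ..."]),
    (3, ["def omega(): ..."]),
)
```

If both edits use index `3` (both append at the bottom of the file), `merge_insertions` raises `MergeConflict`, which is the situation described above.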