Hacker News
by erulabs 2723 days ago
If a mono-repo has a terabyte of code, or if 10 small repos have 1/10th a terabyte each, what have you really gained? In any case, git LFS solves large file storage effectively, as do a number of other artifact storage solutions, and a repo with a terabyte of code is _not_ going to be trivially split apart, since it would be, by a factor of thousands, the biggest codebase ever created by humankind.
2 comments

If I only need to check out one of the smaller repos then I've gained quite a lot in terms of download speed, storage size, etc. Git LFS adds a lot of complexity I'd rather avoid.
Sure, but then you only have some small portion of the total infrastructure, which adds its own layer of complexity for the people reviewing your changes :P It's all trade-offs, is all I'm saying - I honestly still can't decide between the two, although for all companies under 20 people, I'd for sure stick with a single repo.
If I'm working on Application X, wtf do I care about infrastructure code? Or for that matter, as a specific... if someone is working on Google Maps, should they care about the codebase for Google Inbox for Android?
You may be relying on a shared component for your app. You simply put a deps reference to it in your Bazel (Blaze) BUILD file, e.g. "//base:something". That "//base:something" might itself rely on other deps, but that should not be your concern.
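To make the "not your concern" part concrete, here is a rough Python sketch (not Bazel itself) of how a build tool can expand one declared dep into everything it needs. The graph is made up for illustration: only "//base:something" comes from the comment above, while "//myapp:app", "//base:strings" and "//base:logging" are hypothetical labels.

```python
# Hypothetical dependency graph: target label -> directly declared deps.
# Only "//base:something" appears in the thread; the rest is invented.
DEPS = {
    "//myapp:app": ["//base:something"],
    "//base:something": ["//base:strings", "//base:logging"],
    "//base:strings": [],
    "//base:logging": ["//base:strings"],
}


def transitive_deps(label, graph):
    """Return every target reachable from `label`, excluding `label` itself."""
    seen = set()
    stack = list(graph[label])
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already expanded via another path
        seen.add(current)
        stack.extend(graph[current])
    return seen


if __name__ == "__main__":
    # What the app author wrote in their BUILD file:
    print("declared:", DEPS["//myapp:app"])
    # What the build tool works out on its own:
    print("resolved:", sorted(transitive_deps("//myapp:app", DEPS)))
```

The app author lists one label; the rest of the closure is the tool's problem, which is the point being made.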

So, what's stopping you from depending on (using) anything else? Or how do you get stopped from doing that? Bazel (Blaze) has visibility rules, which are private by default: the rules in your package are hidden unless explicitly made public, or alternatively you can whitelist, one by one, which other packages (e.g. //java/com/google/blah/myapp) are allowed to depend on you.
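A simplified Python model of that check might look like the following. This is only a sketch of the idea, not Bazel's implementation: it handles "//visibility:public", "//visibility:private", "__pkg__" and "__subpackages__" entries and ignores things like package groups. The function names and the ":app" target names are my own assumptions.

```python
def package_of(label):
    """Extract the package path from a label: '//base/net:http' -> 'base/net'."""
    return label[2:].split(":", 1)[0]


def is_visible(target, visibility, consumer):
    """Approximate whether `consumer` may depend on `target`.

    `visibility` is the target's visibility list. An empty list is treated
    as private, mirroring the private-by-default behaviour described above.
    """
    consumer_pkg = package_of(consumer)
    if consumer_pkg == package_of(target):
        return True  # targets in the same package can always see each other
    for spec in visibility:
        if spec == "//visibility:public":
            return True
        if spec == "//visibility:private":
            continue  # grants nothing beyond the same-package rule
        pkg, _, kind = spec[2:].partition(":")
        if kind == "__pkg__" and consumer_pkg == pkg:
            return True  # exactly this package is whitelisted
        if kind == "__subpackages__" and (
            pkg == "" or consumer_pkg == pkg or consumer_pkg.startswith(pkg + "/")
        ):
            return True  # this package and everything beneath it
    return False


if __name__ == "__main__":
    allowlist = ["//java/com/google/blah/myapp:__pkg__"]
    # Whitelisted package: allowed.
    print(is_visible("//base:something", allowlist, "//java/com/google/blah/myapp:app"))
    # Some other (hypothetical) package: rejected.
    print(is_visible("//base:something", allowlist, "//java/com/other:app"))
```

In real Bazel the equivalent is the `visibility` attribute on the target, and a violation fails the build rather than returning False.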

Let's say there is a cool new service and your team wants to try it out, but it's not out there for everyone to use; it's in alpha, beta, whatever stage. So you ask the owning team for permission, or simply create a CL adding your package target, name, or "..." folder pattern, so that you are whitelisted. Eventually you will be (if that's a good idea, and approved). The reverse case: some library got deprecated and is slowly being replaced with another, so instead of being "//visibility:public" it now just whitelists its last remaining users. It's probably not a good idea to get added to that list, as the whole thing is going away soon (yes, Google tends to deprecate internally even faster than externally... which is good!). But such mechanisms are helpful in getting this to work correctly.

Does Application X rely on particular infrastructure configuration? Or does Google Inbox on Android integrate with Google Maps?

There are dependencies everywhere. Monorepos are one of the tools which can be used to make dealing with them easier in some cases. They're not an absolute solution, nor appropriate for all circumstances, but no tool is!

> If a mono-repo has a terabyte of code, or if 10 small repos have 1/10th a terabyte each, what have you really gained?

If it's a small company where every developer touches every part of the application, sure. Taking the FAANG approach if you're not part of that acronym sounds like introducing inefficiency.

If it's a "small" company then I'd expect that one Git repo would do just fine for all or at least most of the code. When I think small, I think ~10 or 20 developers. If you have reasonable hygiene about things like keeping binaries out of your Git repo (excluding consideration of e.g. LFS here) then the whole repo size will stay fairly reasonable. As long as you have one or two Git mavens on your team it should be dandy.

I'd expect to see problems with this approach once you get into the 100s or 1000s of developers. The tooling for this scale of repository isn't as mature.

Sorry, what am I missing? That's exactly what I was saying - this stops making sense anywhere in between "small" and "the big boys"
> Taking the FAANG approach if you're not part of that acronym sounds like introducing inefficiency.

Is this not saying that small companies should avoid monorepos?

Specifically excluded in the preceding sentence in my post.