Hacker News new | ask | show | jobs
by CharlieDigital 594 days ago
It doesn't have to be that way.

You can have a mono-repo and deploy different parts of the repo as different services.

You can have a mono-repo with a React SPA and a backend service in Go. If you fix some UI bug with a button in the React SPA, why would you also deploy the backend?

4 comments

This is spot on. A monorepo can still include a granular and standardized CI configuration across code paths. Nothing about monorepo forces you to perform a singular deployment.

The gains provided by moving from polyrepo to monorepo are immense.

Developer access control is the only thing I can think to justify polyrepo.

I'm curious if and how others who see the advantages of monorepo have justified polyrepo in spite of that.

> If you fix some UI bug with a button in the React SPA, why would you also deploy the backend?

Why would you bother to spend the time figuring out whether or not it needs to get deployed? Why would you spend time training other (and new) people to be able to figure that out? Why even take on the risk of someone making a mistake?

If you make your deployment fast and seamless, who cares? Deploy everything every time. It eliminates a whole category of potential mistakes and troubleshooting paths, and it exercises the deployment process (so when you need it, you know it works).

It's literally automatic by configuring your workflows by file changes.

   > If you make your deployment fast and seamless, who cares? 
It still costs CI/CD time and depending on the target platform, there are foundational limitations on how fast you can deploy. Fixing a button in a React SPA and deploying that to S3 and CloudFront if fast. Building and deploying a Go backend container -- that didn't even change -- to ECS is at least a 4-5 minute affair.
I get that, and like anything it's a matter of trade-offs.

To me, in most cases, 4 or even 10 minutes is just not a big deal. It's fire-and-forget, and there should be a bunch of things in place to prevent bad deploys and ultimately let the team know if something gets messed up.

When there's an outage, multiple people get involved and it can easily hit 10+ hours of time spent. If adding a few minutes to the deploy time prevents an outage or two, that's worth it IMHO.

If you don't deploy in tandem, you need to test forwards & backwards compatibility. That's tough with either a monorepo or separate repos, but arguably it'd be simple with separate repos.
It doesn't have to be that complicated.

All you need to know is "does changing this code affect that code".

In the example I've given -- a React SPA and Go backend -- let's assume that there's a gRPC binding originating from the backend. How do we know that we also need to deploy the SPA? Updating the schema would cause generation of a new client + model in the SPA. Now you know that you need to deploy both and this can be done simply by detecting roots for modified files.

You can scale this. If that gRPC change affected some other web extension project, apply the same basic principle: detect that a file changed under this root -> trigger the workflow that rebuilds, tests, and deploys from this root.

You wouldn't, but making a repo collection into a mono-repo means your mono-deploy needs to be split into a multi-maybe-deploy.

As always, complexity merely moves around when squeezed, and making commits/PRs easier means something else, somewhere else gets less easy.

It is something that can be made better of course, having your CI and CD be a bit smarter and more modular means you can now do selective builds based on what was actually changed, and selective releases based on what you actually want to release (not merely what was in the repo at a commit, or whatever was built).

But all of that needs to be constructed too, just merging some repos into one doesn't do that.

This is not very complex at all.

I linked an example below. Most CI/CD, like GitHub Actions[0], can easily be configured to trigger on changes for files in a specific path.

As a very basic starting point, you only need to set up simple rules to detect which monorepo roots changed.

[0] https://docs.github.com/en/actions/writing-workflows/workflo...

It's not complex if it's just a 'dump everything in one place'-repository, but the concept of a mono repository is that separate applications that share code can share it by being in the same commit stream.

That's where it immediately gets problematic; say you have a client and a server and a shared library that does your protocol or client/server code for you. If you wanted to bump the server part and check if it's backwards compatible with say, the clients that you have already deployed, your basic "match this stuff" pattern doesn't work because it now matches either nothing, because it just builds the library and not the client or the server, or it matches all of it, because you didn't make a distinction. And when you just want the new server with the new library and nothing else, you somehow have to build and emit the new library first, reference that specific version in your code, rebuild your code, and meanwhile not affect the client. That also means when someone else is busy working on the client or on the older library, you have to essentially all be on different branches that cannot integrate with each other because they would overwrite or mess up the work of the people that are busy with their own part.

You could go monorail and do something where every side effect of the thing you touches automatically also becomes your responsibility to fix, but most companies aren't at a scale to pay for that. That also applies to things like Bazel. You could use it to track dependent builds and also tag specific inter-commit dependencies so your server might build with the HEAD source library and your client would still be on an older commit. But that that point you have just done version/reference pinning, with extra steps. You also can't see that at the same time in your editor, unless you open two views in to the same source, at which point you might as well open two views, one for the library and one for the server.

If your project, organisation of people (teams) and dependencies are simple enough, and you only deploy one version at a time, of everything, then yes, doing multiple directories in a single repo or doing multiple repos is not much of a change. But you also gain practically nothing.

There are probably some more options as I don't doubt it is more of a gradient; the Elastic sources seem to do this somewhat well, but they all release everything at once in lockstep. I suppose one way to take separate versions out of that is to always export the builds and have a completely separate CD configuration and workflow elsewhere. This appears to be what they have done as it's not public.

Unless that literally prevents you from accessing other parts of the repository, that creates a big risk that you depend on something from another folder and are not triggering rebuilds when you should.