by shepwalker 1448 days ago
yeaaah, I used to hold this view and drove a team - hard - to essentially pull apart a monorepo into different components with clear contract boundaries. It was by far one of my greatest errors in professional judgement.

At the risk of coming up with a contrived example: let's say you own a service that needs to deserialize a datetime in a request, in a format you don't currently support. Assuming you own the stack, you need to update a) your date library, b) your webserver stack, c) possibly an intermediate webserver stack that includes primitives like logging, telemetry, tracing, auth, and service discovery, and d) your actual service.

If a->d are all independent, separate components, you have to orchestrate those changes through 4 separate repositories. And god forbid something you did at the lowest point in the stack is completely unworkable higher up.
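To make the ordering problem concrete, here is a minimal sketch of the release order those four repos would force on you. The repo names are hypothetical stand-ins for a) through d), and the dependency edges are an assumption based on the example above.

```python
from graphlib import TopologicalSorter

# Hypothetical repos standing in for a) through d); each maps to the
# repos it depends on (assumed edges, not taken from a real project).
DEPENDS_ON = {
    "date-lib": set(),
    "webserver-stack": {"date-lib"},
    "platform-stack": {"webserver-stack"},
    "service": {"platform-stack"},
}


def release_order(depends_on):
    """Return the repos in an order where dependencies are released first."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Each step is a separate PR, review, release, and version bump.
    for step, repo in enumerate(release_order(DEPENDS_ON), start=1):
        print(f"{step}. release {repo}, then bump it downstream")
```

If the change in `date-lib` turns out to be unworkable in `service`, every step in that chain has to be redone.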

There's all sorts of rocket science you could do to orchestrate these changes, but it ends up being contrived and edgecasey.

Most of the pain from monorepos can also be addressed with a dash of rocket science (see: Bazel), but the end model tends to:

a) offer an easier mental model for the user

b) allow for consolidation of infrastructure work, i.e. your build/CI/language tooling teams can focus their efforts on one place

c) let you coordinate changes across the entire stack within one field of view

d) corral some of the worst, disparate instincts of a growing engineering org (i.e. tons of teams optimizing for local maxima without internalizing knock-on effects)

e) produce fewer weird edge cases.

1 comment

It's important to avoid deep dependency chains when doing this, yes. I usually recommend a "framework" multi-package repo and then multiple "application" / "service" repos.

I like having examples, even contrived ones, but I'm not sure I understood this one. Can you elaborate on what you mean? Is it about adding support for a new serialization format for dates in requests to a service? Why would this affect the webserver stack and logging/telemetry/tracing/auth primitives?

I find that a lot of organizations have really strange ideas about how to factor things into separate microservices and libraries. I usually approach it by asking the following question: if this were an open-source library or service (e.g. like Elasticsearch), would you use it? If not, then it's probably not a great candidate for a separate thing, so let's try to come up with something else.

One way to handle CI/CD is using standardized pipelines: e.g. you tag your repo with a tag such as `app:node` or `lib:js`, and the GitHub org pipeline scanner will find it and assign the standard `app:node` or `lib:js` pipeline to it.
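A rough sketch of what such a scanner could do, assuming the repo list has already been fetched. The `Repo` class, the `STANDARD_PIPELINES` table, and the step names are all made up for illustration; this is not any real GitHub API.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical table of standard pipelines, keyed by repo tag.
STANDARD_PIPELINES: Dict[str, List[str]] = {
    "app:node": ["npm ci", "npm test", "npm run build", "deploy"],
    "lib:js": ["npm ci", "npm test", "npm publish"],
}


@dataclass
class Repo:
    name: str
    tags: List[str]


def assign_pipelines(repos: List[Repo]) -> Dict[str, List[str]]:
    """Map each repo name to the pipeline of its first recognized tag.

    Repos without a recognized tag are left out of the result.
    """
    assigned = {}
    for repo in repos:
        for tag in repo.tags:
            if tag in STANDARD_PIPELINES:
                assigned[repo.name] = STANDARD_PIPELINES[tag]
                break
    return assigned


if __name__ == "__main__":
    repos = [
        Repo("checkout", ["app:node"]),
        Repo("ui-kit", ["team:web", "lib:js"]),
        Repo("docs", ["misc"]),
    ]
    for name, steps in assign_pipelines(repos).items():
        print(name, "->", steps)
```

The point is that individual repos carry only a tag, and the pipeline definitions live in one place owned by the infra team.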

A way that I like better, but that most tools unfortunately don't support yet, is for the infra teams to publish libraries that are essentially functions taking some parameters and generating (standard) pipelines/configuration. Those can then be tracked together the same way as other dependencies.
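Roughly what I have in mind, as a sketch: the infra team ships a versioned package exposing functions like the one below, and each repo calls it to generate its pipeline. The function name, parameters, and output shape here are hypothetical and don't match any particular CI tool's schema.

```python
import json


def node_app_pipeline(name, node_version="20", deploy_env="staging"):
    """Generate a standard pipeline definition for a Node app as a plain dict.

    Hypothetical output shape; a real library would emit whatever the
    CI system in use expects.
    """
    image = f"node:{node_version}"
    return {
        "name": f"{name}-pipeline",
        "steps": [
            {"run": "npm ci", "image": image},
            {"run": "npm test", "image": image},
            {"run": f"deploy --env {deploy_env}", "image": "deploy-tools"},
        ],
    }


if __name__ == "__main__":
    # A service repo would pin the infra library version and call this.
    print(json.dumps(node_app_pipeline("checkout", node_version="18"), indent=2))
```

Because it is an ordinary dependency, a pipeline change arrives as a version bump that each repo can review and roll back like any other.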