by threethirtytwo 134 days ago
It’s true. But static, proof-based checks can happen across system boundaries; it’s just that for historical and stupid reasons we misapply them here.

First, put all your services in a monorepo and have it all build as one under CI. That’s a static check across the entire system. When you go polyrepo you open up a mathematical hole where two repos can be mismatched.

Second don’t use databases. Databases don’t type check and aren’t compatible with your functional code written on servers.

The first one is stupid… almost everyone makes this mistake, and most of the time, unless you were there in the beginning to enforce a monorepo, you have to live with someone deploying a breaking change that affects a repo somewhere else. I’ve been in places where they simply didn’t understand the hole with polyrepos and went with it despite me diagramming the issue on a whiteboard (it’s that pervasive). Polyrepos exist for people who like organizing things with names and don’t understand static safety.

And it gets worse. So many people don’t understand that this problem only gets worse over time. The more repos you have in your polyrepo the more brittle everything becomes and they don’t even understand why.

The second one is also stupid but you had to be there in the very very beginning to have stopped it. From its inception, the database should have been designed so that any query you run on it has to be compiled to run on the database and is thus statically checked. This is something that can easily be done but wasn’t done and now the entire World Wide Web just has to make do with testing. Hello? Type checking in application code but not in query code? Why?

That’s like half the problems with distributed systems completely solved. And these problems just exist because of raw stupidity. Stuff like race conditions and deadlocks are solvable too but much harder. But these issues are obvious. What’s not obvious are the aforementioned problems that almost nobody talks about even though they are completely solvable.

4 comments

> First, put all your services in a monorepo and have it all build as one under CI. That’s a static check across the entire system.

There are definitely benefits to this approach. My coworkers do fall into the trap of assuming that all the services will be deployed simultaneously, which is not something you can really guarantee (especially if you have to roll one of them back at some point), so the monorepo approach gives them confidence in some breaking changes that it shouldn't (like adding a new required field).
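To make the required-field case concrete, here is a minimal sketch. The `OrderV1`/`OrderV2` message shapes and the payload are made up for illustration, and the wire format is assumed to be a plain decoded dict. Both classes live in the same repo and everything builds, yet a new consumer still chokes on what an old (or rolled-back) producer sends.

```python
from dataclasses import dataclass


@dataclass
class OrderV1:
    """Message shape known to the old build of a hypothetical service."""
    order_id: str
    amount_cents: int


@dataclass
class OrderV2:
    """The new build adds a required field; the monorepo still compiles."""
    order_id: str
    amount_cents: int
    currency: str  # required, no default


@dataclass
class OrderV2Compat:
    """Same change made backward compatible by giving the field a default."""
    order_id: str
    amount_cents: int
    currency: str = "USD"  # assumed default, purely illustrative


def parse(cls, payload: dict):
    """Build a message object from a decoded wire payload."""
    return cls(**payload)


# During a rolling deploy, or after rolling the producer back, an old
# producer can still be talking to a new consumer.
old_producer_payload = {"order_id": "o-1", "amount_cents": 500}

try:
    parse(OrderV2, old_producer_payload)
    mixed_version_ok = True
except TypeError:
    # The dataclass constructor rejects the missing 'currency' argument.
    mixed_version_ok = False

compat_order = parse(OrderV2Compat, old_producer_payload)
```

The static check only ever sees producer and consumer at the same commit, so it has no way to flag the old-producer/new-consumer pairing.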

I mean you talk about it as if it's a "benefit"

I'm talking about it in the same sense as the "benefit" of using typescript over javascript. Not just a "benefit" but it's the obvious path, the obvious better way.

Almost everything about monorepos versus polyrepos is just a debate about opinions, styles, and preferences. But most people don't understand... the monorepo is definitively better. Don't think of it as a benefit; it makes that style of error literally impossible.

Well, no, it doesn't. A monorepo does nothing to prevent you from making breaking changes, it just stops you from making changes that don't compile/test. You still have to understand that services aren't deploying as an atomic unit and make sure that your network calls are forward and backward compatible.

I never said it stops you from making ALL breaking changes. But it makes a whole class of very common breaking changes impossible. This is a definitive benefit. Monorepo means far fewer errors, polyrepo means more. Every other difference between the two is a debatable opinion, but this one is definitive.

>You still have to understand that services aren't deploying as an atomic unit and make sure that your network calls are forward and backward compatible.

The time between inception of a deploy and the termination of a successful deploy isn't solved. But a monodeploy solves an entire class of errors outside the boundary of an atomic deploy. Think about what's in that boundary and what's outside of that boundary? How long does a deploy take? An hour? How long are you not deploying?

That's the key: static checking can't fix everything and a monodeploy isn't a full guarantee of safety, but it does guarantee the impossibility of a huge class of errors in the interim time between successful monodeploys.

Yeah, I think you're preaching to the choir about static checking, the only point I was making is that monorepo doesn't solve some classes of errors and that I've actually seen it generate false confidence in that realm.

Agreed. You're right it doesn't solve some classes of errors.

I guess my point is, monorepo vs. polyrepo... monorepo is the winner because it solves more classes of errors.

> First, put all your services in a monorepo and have it all build as one under CI. That’s a static check across the entire system.

That helps but is insufficient, since the set of concurrently deployed artifact versions can be different than any set of artifact versions seen by CI -- most obviously when two versions of the same artifact are actively deployed at the same time. It also appears to rule out the possibility of ever integrating with other systems (e.g., now you need to build your own Stripe-equivalent in a subdir of your monorepo).
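A rough sketch of that gap, with hypothetical `api` and `worker` services and the assumption that each one is swapped out instance by instance during a deploy: CI only ever builds the "diagonal" pairs, while a deploy in flight can serve any old/new mix.

```python
from itertools import product

# Hypothetical monorepo history: each commit builds both services together,
# so CI only ever sees matching versions.
ci_tested = {
    ("api@1", "worker@1"),
    ("api@2", "worker@2"),
    ("api@3", "worker@3"),
}


def live_combinations(old: int, new: int) -> set:
    """Pairs that can coexist while a deploy moves from `old` to `new`.

    Assumes either version of either service may be serving traffic at
    the same moment while instances are being replaced.
    """
    apis = [f"api@{v}" for v in (old, new)]
    workers = [f"worker@{v}" for v in (old, new)]
    return set(product(apis, workers))


# Combinations that can be live during the 2 -> 3 deploy but never ran in CI.
untested = live_combinations(2, 3) - ci_tested
```

Here `untested` ends up holding the two mixed pairs, `api@2` with `worker@3` and `api@3` with `worker@2`, which are exactly the states no build in the monorepo ever exercised.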

> Second don’t use databases.

So you want to reimplement your own PostgreSQL-equivalent in another monorepo subdir too? I don't understand how opting not to use modern RDBMSes is practical. IIUC you're proposing implementing a DB using compiled queries that use the same types as the consuming application -- I can see the type-safety benefits, but this immediately restricts consumers to the same language or environment (e.g., JVM) that the DB was implemented in.

>That helps but is insufficient, since the set of concurrently deployed artifact versions can be different than any set of artifact versions seen by CI

Simple, although I only mentioned repos should be mono, I should've also said deployment should be mono as well. I thought that was a given.

>So you want to reimplement your own PostgreSQL-equivalent in another monorepo subdir too?

I'm too lazy to do this but in general I want an artifact that is built. All sql queries need to be compiled and built and the database runs that artifact. And of course all of this is part of a monorepo that is monodeployed.

>I don't understand how opting not to use modern RDBMSes is practical.

It's not practical. My point is there's some really stupid obvious flaws we live with because it's the only practical solution.

> Simple, although I only mentioned repos should be mono, I should've also said deployment should be mono as well. I thought that was a given.

Deploying your service graph as one atomic unit is not a given, and not necessarily even the best idea - you need to be able to roll back an individual service unless you have very small changes between versions, which means that even if they were rolled out atomically, you still run the risk of mixed version sets.

>Deploying your service graph as one atomic unit is not a given,

It's not a given because you didn't make it a given.

>and not necessarily even the best idea - you need to be able to roll back an individual service unless you have very small changes between versions, which means that even if they were rolled out atomically, you still run the risk of mixed version sets.

It is the best idea. This should be the standard. And nothing prevents you from rolling back an individual service. You can still do that. And you can still do individual deploys too. But these are just for patch ups.

When you roll back an individual service your entire system is no longer in a valid state. It's in an interim state of repair. You need to fix your changes in the monorepo and monodeploy again. A successful monodeploy ensures that the finished deploy is devoid of a common class of errors.

Monodeploy should be the gold standard, and individual deploys and roll backs are reserved for emergencies.

> It is the best idea. This should be the standard. And nothing prevents you from rolling back an individual service. You can still do that. And you can still do individual deploys too. But these are just for patch ups.

There are a ton of reasons it's not the best idea. This flies in the face of a lot of _better_ ideas.

Keeping changesets small so that it's easier to debug when something goes wrong? Blown out of the water by deploying everything at once.

Bringing every service up at once is a great way to create the coldest version of your entire product.

Requiring a monodeployment turns canarying or A/B testing entire classes of changes into a blocking rollout where any other feature work has to move at the pace of the slowest change.

> When you roll back an individual service your entire system is no longer in a valid state. It's in an interim state of repair.

The gold standard is that each version of your service can work with each other version of your service, because in The Real World your service will spend time in those states.

> Monodeploy should be the gold standard, and individual deploys and roll backs are reserved for emergencies.

No, because if it's still possible to mix versions in your services, then a monodeploy doesn't actually solve any issues.

I actually am a big fan of fewer services and deploying bigger artifacts, but if you have multiple services, you have to act like you have multiple services.

>Keeping changesets small so that it's easier to debug when something goes wrong? Blown out of the water by deploying everything at once.

The size of your monodeploy is orthogonal to the concept of monodeploy. You can make a large change or a small change.

In fact your deploy can be smart. In a full-system monodeploy going from v2 to v3, it can diff the source of each service, and if a service's source hasn't changed, that service goes from v2 -> v3 without a new build and reuses the v2 artifact. The entire point, though, is that this service (and the entire system) still goes from v2 to v3 and is tagged that way. This is an optional optimization for speed.

In fact, your compiler when building artifacts ALREADY does this. It caches huge parts of the build and reuses it. A deploy can do the same.
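Something like the following is what that reuse could look like. It is only a sketch: the `plan_release` helper, the digest-keyed cache, and the artifact naming are all invented here, and a real build system would hash dependencies and toolchain too, not just source text.

```python
import hashlib


def source_digest(name: str, files: dict) -> str:
    """Hash a service's name and source tree (path -> contents), stably."""
    h = hashlib.sha256()
    h.update(name.encode())
    h.update(b"\0")
    for path in sorted(files):
        h.update(path.encode())
        h.update(b"\0")
        h.update(files[path].encode())
        h.update(b"\0")
    return h.hexdigest()


def plan_release(version: str, services: dict, cache: dict) -> dict:
    """Tag every service with `version`, rebuilding only changed sources.

    `cache` maps a source digest to an already built artifact id; a new
    id stands in for "run the build" in this sketch.
    """
    plan = {}
    for name, files in services.items():
        digest = source_digest(name, files)
        reused = digest in cache
        artifact = cache.setdefault(digest, f"{name}-{digest[:12]}")
        plan[name] = {"version": version, "artifact": artifact, "reused": reused}
    return plan


cache = {}
v2 = plan_release("v2", {"api": {"main.py": "a"}, "worker": {"main.py": "w"}}, cache)
v3 = plan_release("v3", {"api": {"main.py": "a2"}, "worker": {"main.py": "w"}}, cache)
```

In `v3`, `worker` keeps the artifact built for `v2` while still being tagged `v3`; only `api` gets a new artifact.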

This is the important concept of a monodeploy: The static check; The integration testing. The verification of the ENTIRE system as a whole. Your monodeploy determines what new artifacts need to be recreated, what artifacts need to be reused... verifies everything, and deploys.

>Requiring a monodeployment turns canarying or A/B testing entire classes of changes into a blocking rollout where any other feature work has to move at the pace of the slowest change.

Again, orthogonal. You're complaining that a monodeploy is slow. Integration testing and unit testing are also slow and take time. The monodeploy is for safety. If you're saying speed > safety, here's an idea: throw all testing out the window as well. That's a big speed up right there.

If your monodeploy is slow, work on speeding it up. Work on it being smarter and faster. Do you throw testing out the window because it's slow or do you work on speeding it up? Make the smart choice.

>The gold standard is that each version of your service can work with each other version of your service, because in The Real World your service will spend time in those states.

And that gold standard is stupid. We can do better. We can go to a state where different versions between different services don't exist. Only one monoversion. You throw that concept of different versions out the window then you also throw the possibility of a mismatch out the window as well.

You're trying to deal with an error. I'm saying make the error not exist.

>No, because if it's still possible to mix versions in your services, then a monodeploy doesn't actually solve any issues.

It's not possible to mix versions in a monodeploy because the whole concept of it is to have ONE version of everything. Let me be clear: I'm talking about a MONOREPO + MONOBUILD + MONODEPLOY. If there's only one version of everything and it's all deployed, then these issues are solved under this model. At this point I think you just don't like being wrong.

>I actually am a big fan of fewer services and deploying bigger artifacts, but if you have multiple services, you have to act like you have multiple services.

A monodeploy doesn't preclude multiple services. You can still act like it's different services. A monodeploy + monorepo just makes sure there's ONE version of the entire system.

Your solution here is just saying you want to be able to deploy different versions of different services in a staggered way. You want different repos so different modules of the system can move out of step with everything else. Service A is at v23, Service B is at v32.

The only way to deal with this mismatch is to have a complicated "versioning" system on top of that, where API contracts between services only accept "backward compatible" changes. This works but it's also extra complication and extra restriction. You can no longer radically change an API because it can break a number of systems in different repos. You're stuck. Or, if you're willing to deal with the fallout, you can make breaking changes and accept the risk, whilst under my system the risk doesn't even exist.

You are advocating for an idea that's definitively worse. But you'll never admit it, not right now anyway, because you've dug your heels into the ground. I've never seen a human so unbiased that they can reason their way into flipping their stance at this point. If we continue talking, you will continue to build logical scaffolding to support YOUR point rather than to support A point, and it's pointless (pun intended) to keep going.

I'm ok to keep going, but I think it's completely obvious to any neutral arbiter that the conversation is over and that your perspective is rationally worse.

> Second don’t use databases. Databases don’t type check and aren’t compatible with your functional code written on servers.

That isn't very useful by itself. What's your suggested alternative that aligns with your advice of "don't"? How does it deal with destructive changes to data (e.g. a table drop)?

There are no alternatives. My point is the whole concept was designed with flaws from the beginning.

>How does it deal with destructive changes to data (e.g. a table drop)?

How does type checking deal with this? What? I'm not talking about this. I'm talking about something as simple as a typo in your SQL query, which can bring your system down unless you have tests or a giant ORM that's synced with your database.
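For what it's worth, you can approximate the "compile the queries" step today by preparing every query against the schema at build or startup time. Below is a minimal sketch using Python's standard `sqlite3` module; the `users` table, the query names, and the deliberately misspelled column are made up, and counting `?` to supply placeholder bindings is a naive shortcut. It catches the typo before any request runs, but it is still a check bolted on next to the application rather than one shared type system.

```python
import sqlite3

# Hypothetical schema and queries, standing in for an application's SQL.
SCHEMA = "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
QUERIES = {
    "find_user": "SELECT id, email FROM users WHERE id = ?",
    "find_by_email": "SELECT id, emial FROM users WHERE email = ?",  # typo
}


def check_queries(schema: str, queries: dict) -> dict:
    """Compile each query against an empty in-memory copy of the schema.

    Returns {query name: error message} for queries that fail to compile.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    errors = {}
    for name, sql in queries.items():
        try:
            # EXPLAIN makes SQLite prepare the statement without running it.
            # Naive assumption: every '?' in the text is a bind parameter.
            params = (None,) * sql.count("?")
            conn.execute("EXPLAIN " + sql, params)
        except sqlite3.Error as exc:
            errors[name] = str(exc)
    conn.close()
    return errors


errors = check_queries(SCHEMA, QUERIES)
```

Only `find_by_email` shows up in `errors`, flagged for the unknown `emial` column; `find_user` prepares cleanly.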

I'm not saying distributed systems are completely solved. I'm saying a huge portion of the problems exist because of preventable flaws. Why talk about the things that can't really be easily solved and why don't we talk about the things that can be solved?

Oh, I thought you were speaking more to the topic and content of the article in question, which goes to great lengths to describe the sorts of problems that are much, much harder to catch than simple compiling of queries and checking them against the database, or the message store.

Even if you were to reduce the database to a simple API, the question then remains how do you make sure to version it along with the other portions of the system that utilize it to prevent problems. The point of the article seems to be to point out that while this is a much harder problem (which I think you are categorizing as "things that can't really be easily solved"), there are actually solutions being developed in different areas that can be utilized, and it surveys many of them.

>Oh, I thought you were speaking more to the topic and content of the article in question, which goes to great lengths to describe the sorts of problems that are much, much harder to catch than simple compiling of queries and checking them against the database, or the message store.

Right. But we haven't even solved square one, which is the easy stuff. That's my point.

>Even if you were to reduce the database to a simple API, the question then remains how do you make sure to version it along with the other portions of the system that utilize it to prevent problems.

I said monorepo and monodeploys in this thread. But you need to actually take it further than this. Have your monorepo be written in a MONOLANGUAGE: no application language + SQL, just one language to rule them all. Boom. Then the static check is pervasive. That's a huge section of preventable mistakes that no longer exist, because now the type that represents your table can never be out of sync.

I know it's not "practical" but that's not my point. My point is that there's a huge portion of problems with "systems" that are literally obviously solvable and with obvious solutions it's just I'm too lazy to write a production grade database from scratch, sorry guys.

> I said monorepo and monodeploys in this thread

And that helps when you are dealing with schema changes that need to be rolled out at AWS, your local DB, a Kafka cluster, how? The whole point of this article was how to approach the problem when there are different components in the system that make a monorepo, and what it provides, infeasible or impossible.

> I know it's not "practical" but that's not my point. My point is that there's a huge portion of problems with "systems" that are literally obviously solvable and with obvious solutions it's just I'm too lazy to write a production grade database from scratch, sorry guys.

The article talks about database solutions that help with this problem.

I'm uncertain how to interpret your responses in light of the article, when they seem to be ignoring most of what the article is about, which is solving exactly these problems you are talking about. Is your position that we shouldn't look for solutions to the harder problems because some people aren't even using the solutions to the easy problems?

The article is about coping mechanisms for a world where we already accepted fragmented systems: polyrepos, heterogeneous languages, independently versioned databases, queues, infra, and time-skewed deployments. Given that world, yes, you need sophisticated techniques to survive partial failure, temporal mismatch, and evolution over time.

That is not what I’m arguing against.

My point is more fundamental: we deliberately designed away static safety at the foundation, and then act surprised that “systems problems” exist.

Before Kafka versioning, schema migration strategies, backward compatibility patterns, or temporal reasoning even enter the picture, we already punched a hole:

Polyrepos break global static checking by construction.

Databases are untyped relative to application code

SQL is strings, not programs

Deployments are allowed to diverge by default

That entire class of failure is optional, not inherent.

When I say “we haven’t solved square one,” I’m saying: we skipped enforcing whole-system invariants, then rebranded the fallout as unavoidable distributed systems complexity.

So when you say “the article already offers solutions,” you’re misunderstanding what kind of solutions those are. They are mitigations for a world that already gave up on static guarantees, not solutions to the root design mistake.

I’m not claiming my position is practical to retrofit today. I’m claiming a huge portion of what we now call “hard systems problems” only exist because we normalized avoidable architectural holes decades ago.

You’re discussing how to live in the house after the foundation cracked.

I’m pointing out the crack was optional and we poured the concrete anyway.

I’m telling you this now so you are no longer uncertain and are utterly clear about what I am saying and what my position is. If anything is unclear, please point out logically what isn’t clear, because this phrase: “The article talks about database solutions that help with this problem.” shows you missed the point. I am not talking about solutions that help with the problem, I am talking about solutions that make a lot of these problems non-existent within reality as we know it.

> The second one is also stupid but you had to be there in the very very beginning to have stopped it. From its inception, the database should have been designed so that any query you run on it has to be compiled to run on the database and is thus statically checked. This is something that can easily be done but wasn’t done and now we all just make do with testing.

This is a good point, I've never really thought about this carefully before, but that makes so much sense.

Yeah it's the stupidest thing but nobody even thinks about this. Everyone I talk to thinks SQL is the greatest thing ever. I mean it's the greatest because there's no obviously superior alternative.