by mherdeg 1148 days ago
Best of luck to the author on their tiered monolith journey! Some things to think about:

(1) How many distinct binary versions of your monolith will you permit to run in production at once? Some options might include:

At most two globally ("current" and "new" canary/blue-green deployment, no special code on particular tiers)

At most two globally, but sometimes you're willing to deploy a special build to a single tier to mitigate an emergency, with eventual convergence

At most two per tier, but with no attempt to keep each tier running the same binary code (maybe you don't want to redeploy your async consumers as frequently as you redeploy your http handlers)

An unlimited number (maybe you deploy customer-specific binary code to specific instances within a tier)

Would you like an alert when there are too many distinct versions running in production? Who should get that alert and what should they do when it fires?
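
To make the alert question in (1) concrete, here is a rough sketch of a check that counts distinct versions globally and per tier. Everything in it is an assumption for illustration: the `running` inventory of (tier, version) pairs, the tier names, and the default limits are hypothetical, and a real check would read from whatever your deploy tooling reports.

```python
from collections import defaultdict


def distinct_versions(running):
    """Map each tier to the set of versions seen on it.

    `running` is an iterable of (tier, version) pairs (a hypothetical inventory).
    """
    by_tier = defaultdict(set)
    for tier, version in running:
        by_tier[tier].add(version)
    return dict(by_tier)


def version_violations(running, max_global=2, max_per_tier=2):
    """Return human-readable policy violations; pass None to disable a limit."""
    by_tier = distinct_versions(running)
    all_versions = set()
    for versions in by_tier.values():
        all_versions |= versions

    problems = []
    if max_global is not None and len(all_versions) > max_global:
        problems.append(
            f"{len(all_versions)} distinct versions globally (limit {max_global})"
        )
    for tier in sorted(by_tier):
        count = len(by_tier[tier])
        if max_per_tier is not None and count > max_per_tier:
            problems.append(
                f"{count} distinct versions on tier {tier!r} (limit {max_per_tier})"
            )
    return problems
```

The "at most two globally" option corresponds to the defaults, while "at most two per tier" corresponds to `max_global=None`. Who receives the resulting list, and what they do about it, is still the organizational question above.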

(2) Does this thing deploy simultaneously everywhere?

If so, is any specific person or team responsible for making sure the deployment worked ok on every tier, declaring an incident if not, and rolling back and finding an owner to resolve the issue? Will every team who owns a part of the monolith contribute someone to a shared rotation for release monitoring?

(3) Suppose there is a blocking problem in one part of the monolith, for example async message processing stops working reliably. Should this block deployment or development for other teams whose changes are outside this blast radius?

(4) Suppose that, after a certain build revision, some low-level intermittent compilation error prevents the binary from starting up 10% of the time on every tier. What team will work to resolve this kind of problem? Is there a team writing telemetry and common logging for your monolith everywhere? Is there a team who will implement common operational concerns like feature flags to gate binary changes?

(5) Does your monolith run in any non-production environments? Is every tier running in each environment? Does somebody publish an SLO for those environments? Is one team allowed to break everyone else in pre-prod by deploying experimental code to the monolith in some environment? Who deploys to the pre-production environment and how?

(6) Suppose you discover you need to split up your workload (one kind of http request is much slower than all others and you want to separate failure domains). How much work does it take to create an additional tier -- updating deployment jobs, quality gates, and CI/CD pipelines throughout various environments, provisioning resources, setting up graphs and alerts, creating new tests? Who does this work?

(7) How will you manage configuration for your monolith? Will configuration directives be delivered to every tier simultaneously? Can someone accidentally break the behavior of another team's tier with a typo or logic error in a configuration change?
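
One possible guard for the cross-team breakage in (7), sketched under some loud assumptions: configuration keys are namespaced as `<tier>.<setting>`, and there is an ownership map from tier to team. Neither convention comes from the article; the `OWNERS` table and the team names below are made up.

```python
# Hypothetical ownership map: tier name -> owning team.
OWNERS = {"http": "web-team", "async-worker": "pipeline-team"}


def tiers_touched(changed_keys):
    """Return the set of tiers named by keys of the form '<tier>.<setting>'."""
    return {key.split(".", 1)[0] for key in changed_keys}


def unowned_tiers(changed_keys, author_team, owners=OWNERS):
    """List tiers a change touches that the author's team does not own."""
    return sorted(
        tier
        for tier in tiers_touched(changed_keys)
        if owners.get(tier) != author_team
    )
```

A review gate could require sign-off from the owning team whenever `unowned_tiers(...)` is non-empty. This only catches changes that are scoped per tier; a global directive delivered to every tier at once still needs a different answer.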

(8) When it comes time to split this thing into microservices or macroservices for a few years before a successor team looks at the mess and decides to reimplement a monolith, how do you set up your architecture to successfully allow a split?

(9) Are you absolutely sure your tiers do what you think they do? Can API customers bypass rate limiting by pointing to the hostname of your async-worker tier? If a security vulnerability in a particular http route affects your monolith, will you remember to block the route on every tier (even the ones you think don't normally serve web traffic)?
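
For the last question in (9), a small audit can at least tell you which tiers have not picked up a route block. This is a sketch only: the `blocked_routes_by_tier` mapping is a hypothetical export of each tier's deny list, and the tier and route names are invented.

```python
def tiers_missing_block(blocked_routes_by_tier, route):
    """Return tiers whose deny list does not include `route`.

    `blocked_routes_by_tier` maps tier name -> collection of blocked routes
    (a hypothetical export from whatever enforces the block).
    """
    return sorted(
        tier
        for tier, blocked in blocked_routes_by_tier.items()
        if route not in blocked
    )
```

The important design choice is that the input lists every tier running the binary, including the ones you believe never serve web traffic, since those are the ones most likely to be forgotten.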