Hacker News
Decoupling a core service from your monolith the right way (betterprogramming.pub)
61 points by eallam 1111 days ago
10 comments

I don't understand why people choose to put an HTTP barrier in their code. It would have been much better if they had stopped at a Billing module. They get all the code factoring benefits they seek without the performance and operational overhead of an additional service.
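As a rough illustration of the "stop at a Billing module" idea, here is a minimal Python sketch. The names `BillingModule`, `Invoice`, `charge`, and `total_paid` are hypothetical and not taken from the article. The point is only that callers depend on a narrow public interface while every call stays in-process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """Hypothetical value object returned across the module boundary."""
    customer_id: str
    amount_cents: int
    paid: bool


class BillingModule:
    """Hypothetical in-process billing boundary: callers use only public methods."""

    def __init__(self) -> None:
        # Private state; other modules are expected not to touch this directly.
        self._invoices: list[Invoice] = []

    def charge(self, customer_id: str, amount_cents: int) -> Invoice:
        if amount_cents <= 0:
            raise ValueError("amount_cents must be positive")
        invoice = Invoice(customer_id, amount_cents, paid=True)
        self._invoices.append(invoice)
        return invoice

    def total_paid(self, customer_id: str) -> int:
        return sum(
            inv.amount_cents
            for inv in self._invoices
            if inv.customer_id == customer_id and inv.paid
        )


billing = BillingModule()
billing.charge("cust-1", 1500)
billing.charge("cust-1", 500)
print(billing.total_paid("cust-1"))  # 2000
```

If the module later has to become a separate service, the same public methods are a natural starting point for its API.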
I think it's a branding thing. Basically the cargo cult (which copies what's new/hype/difficult) latched onto it. Most of these hype things die in a few years with no lasting legacy or resume value though (mongo, serverless, et al).

Someday hopefully some genius proposes "hyper-sidecar-ification" where you take microservices and package them together in a sophisticated way to avoid the limitations and latency of an http barrier. As long as it's new & complicated & buzzwordy (even if it's just monorepo again) it can catch on.

The latency concern is kind of FUD with topology-aware routing.
It’s not FUD if you are making hundreds or thousands of calls. Many “chatty” apps slow down dramatically when the wrong network boundary is drawn in the code.

The difference between a function call and a network request is vast for even modern networks.

This is true when you have 1 microservice. When you have dozens of them waterfalling requests through each other it starts adding up really fast.
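To make the "adds up really fast" point concrete, here is a back-of-the-envelope sketch. The two latency figures are illustrative assumptions, not measurements; plug in numbers from your own environment.

```python
def sequential_latency_ms(calls: int, per_call_ms: float) -> float:
    """Total time for `calls` calls made one after another."""
    return calls * per_call_ms


# Assumed, illustrative figures (not measurements):
FUNCTION_CALL_MS = 0.0001  # roughly 100 ns for an in-process call
NETWORK_CALL_MS = 0.5      # roughly 0.5 ms for a same-datacenter round trip

for calls in (1, 100, 1000):
    in_process = sequential_latency_ms(calls, FUNCTION_CALL_MS)
    over_network = sequential_latency_ms(calls, NETWORK_CALL_MS)
    print(f"{calls:>5} calls: in-process {in_process:.4f} ms, network {over_network:.1f} ms")
```

Under these assumptions a single call is cheap either way, while a chatty path of a thousand sequential calls spends about half a second on the network alone.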
It's because the code smells bad and they're having a hard time getting motivated to clean it up piece by piece because even when their piece is clean they'll still have to deal with the smell of the adjacent pieces. HTTP lets you say:

> That dumpster fire over there is not my problem

Without having to say:

> I want to rewrite the whole thing from scratch, you'll have to deal with it being down for a few months

Again, you don't need HTTP for that either. The GP's proposal achieves exactly the same thing.

IPC (networks, FIFO, pipes or whatever) decoupled services solve problems more on the ops side, from binary compatibility to resource segregation. They do very little on the dev side.

Of course you don't need it technologically speaking. Between DNS and SSL and stuff like OpenAPI, HTTP comes with a lot of baggage that is redundant when used on the back end.

My point is that it's not a technology problem in the first place. It's a problem of clashing senses of style and taste. It's a culture problem. If you need to dress up your culture problem fix with a technology explanation, HTTP is the way to go because it's widely understood and so boring that nobody is going to ask you to explain it in too much detail.

So, I have to disagree again.

Your language's module system is boring. A specialized web server is about as close to the opposite as a corporate structure will allow.

One of HTTP's biggest strengths here is that it's essentially perfectly language-agnostic (in practice, not just theory) and it lets you solve basically all networking problems (hardware and software) in normal, off-the-shelf, trivial-through-Google-scale ways without having to change it. And if you want to build something custom, there are more experts and documentation and examples than for any other system.

I agree it's suboptimal for performance. But it absolutely cannot be beat for business stability and flexibility. And your IPC costs generally do not dominate your request times anyway.

Yeah, I think this is the realistic answer. Technical solution to a social problem etc
Yes! I call them “modular monoliths” and fight tooth and nail to get people to stop introducing network when it doesn’t need to be there.

I want to make people do Bart Simpson style writing on the chalkboard “Microservices don’t make things easier” 200 times

When you're releasing changes daily, a dozen groups are all touching the same code, and all of a sudden one of those ten changes kills the application with OOMs galore, your ops people are going to have a bad day. Oh, some minor component written by a newer dev added a non-reversible DB migration in that same push? You're having a very bad day(s). Microservices are not the panacea that will work for all organizations/products/workflows, but damn do they do amazing things for others. Running a platform with dozens of teams with loose organization, distributed timezones, and conflicting priorities abound, it's nice to not worry about other teams stopping our perfectly happy chunk of the company from doing its thing.
Yes, but these services need not be “micro”.

If someone twists my arm and convinces me to do some consulting, I’ve seen thousands of services started by individual developers, where there is no accounting, so the ops people don’t know what needs to be kept up or what can be shut down.

Definitely separate regulatory differences (marketing vs credit card processing for example). Generally, separate out high velocity code changes from slow velocity code if possible.

I wouldn’t get in the habit of database migrations having anything to do with code pushes. Unless you have no users, I guess. Yikes. So much to say on that topic alone.

There are plenty of genuinely good reasons to split up monoliths. I’m not attempting to say that all SOA is bad.

What I am saying is that companies who break apart monoliths as a code organization tool are making a very bad decision. They’re getting all the downsides and none of the upsides of microservices.

I do agree that at some point it breaks down. But dozens of teams probably should be split for organizational reasons anyways so splitting up tech isn't much more overhead.
> splitting up tech isn't much more overhead.

This is the fallacy. Splitting up tech has huge overhead.

There are lots of good reasons to do it. But “operational efficiency” is not one of them, even though it’s oft cited as one.

I think that while the approach is good, there was a better option (IMHO). Specifically, I would:

1. Create Protocol Buffers (or similar) messages that wrap the requests and responses. E.g. CheckSubscriptionPaidRequest and CheckSubscriptionPaidResponse.

2. Refactor your code so that the fields in the messages defined in #1 are your only means to pass/retrieve information.

3. If need be, expose the service through gRPC or similar.

4. Repeat for each service/endpoint.

This way, you don't incur any overhead apart from the negligible cost of creating the protobuf message instances.
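The steps above can be sketched roughly as follows. This uses plain Python dataclasses as a stand-in for generated protobuf classes, which is a deliberate simplification. The message names come from the comment; the fields (`subscription_id`, `paid`) and the `BillingService` class are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckSubscriptionPaidRequest:
    subscription_id: str  # hypothetical field


@dataclass(frozen=True)
class CheckSubscriptionPaidResponse:
    paid: bool  # hypothetical field


class BillingService:
    """Hypothetical in-process service: message in, message out, nothing else shared."""

    def __init__(self, paid_subscription_ids: set[str]) -> None:
        self._paid = set(paid_subscription_ids)

    def check_subscription_paid(
        self, request: CheckSubscriptionPaidRequest
    ) -> CheckSubscriptionPaidResponse:
        # Everything the caller may pass or read lives in the two messages.
        return CheckSubscriptionPaidResponse(
            paid=request.subscription_id in self._paid
        )


service = BillingService({"sub-1"})
response = service.check_subscription_paid(CheckSubscriptionPaidRequest("sub-1"))
print(response.paid)  # True
```

Because the boundary is already expressed as messages, putting a gRPC server in front of `check_subscription_paid` later should not require changing the callers' data model.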

I think they wanted their own deploy timelines.
In practice there's not much difference if you use a wrapper application. The wrapper application sets the version of modules and provides whatever glue they need. For Rails that would be gems in the Gemfile and mounted engines. To deploy a new version of your module you bump the version in the Gemfile and roll the nodes.

Essentially you have:

Modules = Microservices

Wrapper = Terraform/Helm Charts

And practically they work the same way.

This is another thing that used to make sense but no longer does. Back in the day we had pets and not cattle. You couldn't just roll someone's servers and expect everything to be fine. But today we write stateless cattle that can be killed at any moment.

This seems to sidestep governance of the repository pulling said module version.
You need governance of the client choosing a specific version of the service too.

Or, in other words, you can't have teams cooperating without governance.

If you allow at most 2 versions and expect the team introducing a change to rewrite all code...

At a small code size, tooling for that may be off-the-shelf for a monolith.

Are there any rewrite tools that work across repos though?

Exactly.

If your concern is code health, a compile-time dependency works great.

If your concern is resource management, then you need a runtime dependency.

This.

The end of the article covers why they felt complete isolation was worth the network costs. Maybe this is true for their organization.

From working on one of the largest Ruby codebases for the last 5+ years, I see the massive benefits of isolation without introducing the HTTP barrier. Yes, service isolation can let you ship faster. In practice, the network barrier will make some kinds of rollbacks easier and others far more difficult.

The reliability of the whole system is a lot more costly to maintain with the added network failure modes.

Over time, the proliferation of services imposes an ever-growing maintenance tax to keep libraries up to date and mitigate security vulnerabilities across an organization.

Tl;dr there are no free lunches.

In theory, separate teams can manage separate deployment schedules, etc more closely aligned with their feature work.
In theory, you mean if there were like separation of concerns. Like the billing address being in the billing service instead of the customer service?
You have to draw the line somewhere, or Chrome would have to ship with the whole WWW.
Scaling and load balancing.
This is one of those things that was true in like 2008 but is no longer true in 2023. Scaling a monolith is slightly more expensive in memory consumption but in practice it's irrelevant for most cases.

In this particular case it's worse due to how Rails works. Usually people deploy one Rails thread per core and that core is blocked by the thread.

If that's the case when Service A calls Service B, Service A is blocking a core and waiting until Service B completes. That's effectively doubling resource consumption during that call. You have one server waiting on another server to do work it could have done itself.

I don’t think most modern Rails deployments have this problem anymore. Puma does pretty well with parallelization
Yeah, and with Fibers things are getting even easier to do. But there are other performance issues and ultimately the benefits are minor.
Ruby threads suspend on IO so a blocking HTTP request is very low cost.
From a business perspective, unlocking this precision for infrastructure is only worth the investment at the highest levels of scale.
> and since everything lived in the same monolithic, the billing codebase was coupled with other core modules like transfers and authorization

This doesn't follow, but everyone always seems to think that it does. They even demonstrate that it doesn't in Phase 1! "Decouple billing logic within the monolith"

This is building a SOA distributed monolith, which is kind of cutting off your nose to spite your face. I've been there - would not recommend.

It makes the system brittle, slow, and forces strong commitments that dependent services remain up (rolling releases with non breaking migrations, etc).

If I were to do it again, then I would first ensure that the infrastructure is there for inter-service communication to be done asynchronously, and that changes are eventually consistent. Maybe using a workflow manager like Camunda or Temporal. Or even just event choreography between services - either of those is better than a synchronous HTTP call chain of what will become 7 dependent services.
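A toy sketch of the event choreography idea, to show the shape of it. The `EventBus` class is an in-memory stand-in written for this example, not the API of Camunda, Temporal, or any real broker, and the topic names are hypothetical.

```python
from collections import defaultdict, deque
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Toy in-memory stand-in for a message broker (an assumption for illustration)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self._queue: deque[tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        # Publishing only enqueues; the publisher never waits on consumers.
        self._queue.append((topic, payload))

    def drain(self) -> None:
        # Deliver queued events; handlers may publish follow-up events.
        while self._queue:
            topic, payload = self._queue.popleft()
            for handler in self._handlers[topic]:
                handler(payload)


bus = EventBus()
ledger: list[str] = []


def on_order_placed(event: dict) -> None:
    ledger.append(f"invoice for {event['order_id']}")
    bus.publish("invoice_created", {"order_id": event["order_id"]})


def on_invoice_created(event: dict) -> None:
    ledger.append(f"email for {event['order_id']}")


bus.subscribe("order_placed", on_order_placed)
bus.subscribe("invoice_created", on_invoice_created)

bus.publish("order_placed", {"order_id": "o-1"})
pending = list(ledger)  # still empty: nothing has been processed yet
bus.drain()
print(ledger)  # ['invoice for o-1', 'email for o-1']
```

No handler calls another service directly; each one reacts to an event and emits the next, so a slow or unavailable consumer delays its own step rather than blocking the whole chain.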

I agree that asynchrony would be a stronger technical solution, but it’s not the defining characteristic of SOA. The author mentions circuit breakers so it seems like they did think about network resiliency.

Other than async what would make it less of a “distributed monolith”, can you say?
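For readers unfamiliar with the circuit breakers mentioned above, here is a minimal sketch of the pattern. It is not the article's implementation; the class, its parameters, and the simulated `flaky_billing_call` are all hypothetical, and production libraries handle concurrency and metrics that this omits.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when the breaker refuses to call the dependency."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], object]) -> object:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def flaky_billing_call() -> str:
    raise ConnectionError("billing service unreachable")  # simulated outage


breaker = CircuitBreaker(max_failures=2, reset_after_s=30.0)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_billing_call)
    except CircuitOpenError:
        outcomes.append("skipped")
    except ConnectionError:
        outcomes.append("failed")
print(outcomes)  # ['failed', 'failed', 'skipped']
```

After two consecutive failures the third call is rejected without touching the network, which keeps a struggling dependency from tying up the caller's threads.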

I am using SOA out of place and I should not have included it in my comment - I agree.

I guess distributed monolith is a nebulous term, and I'm sure people have their own criteria. To me, the defining characteristic IS the size of that synchronous call chain. If to serve some of the public operations of your system you need to make a synchronous HTTP call that spans more than one service, then I consider those services to be too tightly coupled and the system is closer to being a distributed monolith than a set of independent services (I'd make an exception if the first service is an API gateway or is very explicitly a kind of middleware service, and not defining business logic).

The degree to which the system is a distributed monolith, and how much one should care about that fact or invest effort to steer away from it is a function of how big the biggest one of those call chains is. I don't have a binary definition, more of a sliding scale. The way to avoid sliding more into the direction of a distributed monolith (at least the way I reckon it) is to avoid making those call chains from the get go.
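One way to put a number on that sliding scale is to measure the longest synchronous call chain in the service graph. The sketch below is only an illustration of that idea: the graph is hypothetical, and the function assumes the call graph has no cycles.

```python
def longest_sync_chain(calls: dict[str, list[str]], start: str) -> int:
    """Number of services on the longest synchronous call path from `start`.

    Assumes the call graph is acyclic.
    """
    downstream = calls.get(start, [])
    if not downstream:
        return 1
    return 1 + max(longest_sync_chain(calls, svc) for svc in downstream)


# Hypothetical graph: each key synchronously calls the listed services.
sync_calls = {
    "gateway": ["orders"],
    "orders": ["billing", "inventory"],
    "billing": ["payments"],
    "payments": [],
    "inventory": [],
}
print(longest_sync_chain(sync_calls, "gateway"))  # 4
```

Here the longest path is gateway, orders, billing, payments. Discounting the gateway as the comment suggests still leaves three business services chained synchronously.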

Thanks @liampulles, appreciate your insight. Going through the same exercise you've helped me crystallize my thoughts.
> If a rollback were needed, some features would have to be implemented in both the monolith and Billy. This was time-consuming, so moving fast to remove that code from the monolith and rely 100% on Billy was essential.

This is the hardest bit, where if the monolith is relying on a shared db transaction between the client and service the network boundary makes them separate. Even without explicit rollbacks, at any high scale/load there can be timeouts/failures leaving an inconsistent data state.

The article suggests migrating a less critical client first, then developing such consistency mechanisms before migrating the more critical clients.

The article suggests using git submodules, which I wouldn’t recommend.

Mainly because depending on another repo can be flaky. If you depend on another repo’s tag, e.g. “submodule@v1.3”, that tag could be moved to a different commit, which could break the build.

If you depend on another repo’s commit hash, e.g. “submodule@hash”, then you depend on nobody rebasing or running “git push -f”, which could remove that hash from the git history.

All these problems disappear if you use a mono-repo, rather than a git submodule…

I work with a large monolith, and we have been trying to decouple some of our core features for the past couple of years. The one lesson I learnt is that you have to start by making sure new features are not built on your legacy app. Feature flags are also a must.
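A minimal sketch of how a feature flag can gate the switch from a legacy code path to a decoupled one. Everything here is hypothetical: the `FeatureFlags` class is a toy in-memory store, and the flag name `use_new_billing` and both implementations are made up for illustration.

```python
class FeatureFlags:
    """Toy in-memory flag store; real setups usually load flags from config or a service."""

    def __init__(self, enabled: set[str]) -> None:
        self._enabled = set(enabled)

    def is_enabled(self, name: str) -> bool:
        return name in self._enabled


def legacy_invoice_total(items: list[int]) -> str:
    return f"legacy:{sum(items)}"  # stands in for the code path inside the monolith


def new_invoice_total(items: list[int]) -> str:
    return f"new:{sum(items)}"  # stands in for the decoupled implementation


def invoice_total(flags: FeatureFlags, items: list[int]) -> str:
    # Route to the new path only when the (hypothetical) flag is on,
    # so traffic can be moved back without a deploy.
    if flags.is_enabled("use_new_billing"):
        return new_invoice_total(items)
    return legacy_invoice_total(items)


print(invoice_total(FeatureFlags(set()), [100, 250]))                # legacy:350
print(invoice_total(FeatureFlags({"use_new_billing"}), [100, 250]))  # new:350
```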
Grug don’t understand why take decoupling and add network call
A very pragmatic use of SOA / microservice.
Why do people still use Medium?
My blog is hosted on GitHub pages on astro[1]. I'm considering mirroring to Medium and Substack to get more SEO. In the spirit of POSSE: https://indieweb.org/POSSE

[1] https://mteam88.github.io/

Because Large is expensive, and Small is limited ?