Hacker News
by aunwick 929 days ago
My Master's thesis was on high performance distributed computing. And my conclusion was you likely don't have a problem that is hard enough to justify it. Thank you for promoting reality!
3 comments

Microservices usually solve organisational problems, not technical ones. Here, for example, it is probably very painful to deploy to production.
Microservices solve revenue problems at cloud hosting and CI companies.
And maintenance and onboarding. I guess finding the right balance is the key. I tend to go with a monolith and use microservices to offload heavy-duty tasks or to benefit from other languages and frameworks when they are a better fit for a specific problem domain.
That doesn't sound like "microservices", but like a monolith with a few services split out where it makes sense. People have been doing that for a million years, long before the web and certainly long before "microservices" was coined. It's just "using common sense" basically.
I'm not aware of such nuanced definitions. For me a microservice is a one-endpoint service, designed to do one task, with full introspection, and deployed on the "cloud". As far as I know, I can orchestrate them with monoliths and they are still microservices. Or where do you draw the line between services and microservices? The size of the service itself, or with whom it interacts?
There is no One True Definition™, but I think most people understand "microservices" as "an application where splitting the functionality up into small services is the core of the architecture", or something along those lines.

"We split off 2 things into small services because it made sense" is rather different. I mean, I don't really care if you call this "microservices" I guess, but it is different, right?

Yup, a monolith goes a very long way. Only at the point where you have hundreds of engineers or some very specific niche technical problem would I start thinking about microservices.
That Rails app pipeline deploys hundreds of PRs a day to `main`

https://shopify.engineering/software-release-culture-shopify

> Microservices usually solve organisational problems

Or create organizational problems.

- "Who owns the flip-flop service?"

- "That's Bob's team, we fired them last summer!".

"Who owns /monorepo/shopify/infra/cloud/flip-flop/api/foo_bar.rb"?

"That's Bob's team..."

It would be without the tooling, yes. Shopify has a merge queue thingy that you can just shove your MR into and it will eventually get around to deploying it. It even gives you a ping on Slack when your changes are about to go live IIRC.
Oh, "painful" is working under the sun. Installing a program on a computer is far from painful.
But almost everyone benefits from high availability.

So unless you're using managed services, e.g. RDS, you're going to be exposed to the same complexities as distributed computing.

Especially with the cloud where instances can die at any point.

How is that the case? This example uses a distributed MySQL cluster which was of course tuned for high performance. Similarly the Rails app is distributed as well. Arguably the Rails app likely wouldn't qualify as "high performance", but it's distributed.
It amazes me that even when we have numbers people still dismiss Rails.

Nothing will run at that scale on a single VPS. All companies will have a wide range of languages used.

If this is not Rails supporting high traffic, then what more do we need?

Sorry, I love Rails, but just because something can scale (which I never thought it couldn't) doesn't make it a high performance system. That's totally fine; Rails makes other tradeoffs that IMO are more universally useful, even though some people seem unable to understand that server cost for most companies is tiny compared to developer cost.
For some reason, some people will discount any example of Rails scalability as not counting.
They're talking about "distributed" as in a system of services communicating, rather than just copies of the same monolith across multiple instances. The former adds communication and synchronisation overheads and the complexities of failover for every extra service introduced.
That's a totally bizarre definition. Having worked on a high-performance in-memory data grid for the last eight years, I can guarantee that you'll get all the fun distributed systems problems even with a single code base. That definition also excludes pretty much all famous distributed systems like most databases, messaging systems like Kafka and Rabbit etc.

What you seem to be getting at, isn't distributed systems, but the totally self-inflicted pain of a service oriented architecture

> Having worked on a high-performance in-memory data grid for the last eight years, I can guarantee that you'll get all the fun distributed systems problems even with a single code base.

Having spent the last 28 years building distributed network-connected systems, this comes across as wildly obtuse.

The point is that there are orders-of-magnitude differences in complexity between scaling a system with few communication paths and little distribution of state across process or network boundaries, and scaling one with many paths and state distributed in many locations. We don't tend to start talking about distributed systems when you have a tiered stack of a horizontally scalable component sandwiched between a load balancer and a database, even though in a very strict technical sense that is already "distributed".

Once you start adding message queues etc., it certainly becomes more and more reasonable to talk about a distributed system, but there, too, is a distinct grey area with respect to the intent clearly expressed by the original comment, e.g. when queues just trigger jobs in the same code base against the same database.

Put another way, ignore the word "distributed", re-read the original comment, and consider that irrespective of which label you're comfortable with, what the comment is doing is drawing a distinction between two classes of systems with wildly different complexity in the distribution of responsibility and state. Where precisely you draw the line is entirely irrelevant.

> What you seem to be getting at, isn't distributed systems, but the totally self-inflicted pain of a service oriented architecture

No, it really was not. This separation between basic 2/3 tier apps and systems with a more complex data flow pre-dates the SOA buzzword literally by decades.

Maybe the distinction here is one of which scope the respective maintainer cares about. For Shopify, MySQL is mostly a black box: they don't need to re-implement their own atomic commit protocol, network partition detection etc., since MySQL did that for them. The implementors of MySQL did have to solve these distributed systems problems and pick their CAP trade-offs, but I guess that's not the scope Shopify cares about here.
Isn't the full set of these numbers definitionally "high performance"?
Oh, I read the parent comment to thank them for confirming that "you likely don't have a problem that is hard enough to justify it". But reading it again, it could be read both ways.

Edit: To be clear, I agree that this is an example of a distributed, high-performance system, which is why the comment made little sense to me.

Yes, if you take distributed to just mean "the same code on multiple machines". The GP above probably means "different code on different machines interacting" which brings its very own set of problems.
By that definition, pretty much any problem you study in distributed systems theory can occur in a system that doesn't fit the definition, and the most well-known examples of distributed systems, like distributed data stores, message queues etc., aren't distributed systems.