by dmattia
867 days ago
In general, one of the goals of microservices should be that if one of the five services goes down, the other four can still operate in some capacity. In practice, this makes the math quite a bit messier, but from my perspective it hasn't necessarily been worse overall. Instead of a monolith that is either fully up (99% of the time) or fully down, you'll have a system that is fully up 95% of the time (using your numbers). Within that 5% of downtime, though, 20% of the time one of your products will be running slowly, 10% of the time some new feature you launched won't work for specific customers in a specific region, and so on. At my company this makes things like SLA/SLO guarantees for "our services" pretty complicated, because it's hard to define what uptime truly means. Overall, though, I think the five-microservice approach, when done well, should have less than 1% complete downtime, at the cost of more partial downtime.
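To make the arithmetic concrete, here is a rough sketch, assuming five services that are each up 99% of the time and fail independently. Independence is a strong assumption: real outages are often correlated through shared infrastructure, so the "completely down" figure here is far too optimistic. The function names are made up for illustration.

```python
# Back-of-the-envelope availability math for n services.
# ASSUMPTION: each service is up with probability p, and failures are
# independent of each other. Real systems rarely satisfy this.

def fully_up(p: float, n: int) -> float:
    """Probability that all n services are up at the same time."""
    return p ** n


def completely_down(p: float, n: int) -> float:
    """Probability that all n services are down at the same time."""
    return (1 - p) ** n


def partially_down(p: float, n: int) -> float:
    """Probability that at least one service is down, but not all of them."""
    return 1 - fully_up(p, n) - completely_down(p, n)


if __name__ == "__main__":
    p, n = 0.99, 5  # the numbers used in this thread
    print(f"fully up:        {fully_up(p, n):.4%}")        # roughly 95%
    print(f"partially down:  {partially_down(p, n):.4%}")  # roughly 5%
    print(f"completely down: {completely_down(p, n):.10%}")
```

Under these assumptions nearly all of the roughly 5% that is not "fully up" is partial downtime. That only helps if the remaining services are actually useful on their own.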
|
This is an excellent point, but what brought it to mind was that I don't think the microservices in the Netflix article have this property. It looks to me as if, should any of VIS, CAS, LGS, or VES go down, the whole service is effectively down.
Indeed, what I've seen in my own career is that when one microservice goes down, the user won't see 500 errors or the like, but the service will still be completely useless to them. You've just gone from a hard error to a spinning load icon, which might in fact be an even worse user experience.
It could be argued that this is just "you're doing microservices wrong", but then we start getting into No True Scotsman territory.