by VyseofArcadia
867 days ago
> In general, one of the goals of microservices should be that if one of the five services goes down, the other four should be able to operate in some capacity still.

This is an excellent point, but what brought this to my mind was that I don't think the microservices in the Netflix article have this property. It looks to me like if any of the VIS, CAS, LGS, or VES goes down, the whole service is effectively down. Indeed, in my own career what I've seen is that if one microservice goes down, the user won't be seeing 500 errors or friends, but the service will be completely useless to the user. You've just gone from a hard error to a spinning load icon, which might in fact be an even worse user experience. It could be argued that this is just "you're doing microservices wrong", but then we start getting into no true Scotsman territory.
Exactly. What it does is that the first few hours of the triage call go to people claiming "well, my service is up and the issue is somewhere else". So finding which service failed itself takes crucial hours that should have gone to fixing the failing service.
But in a world where Micro Service Incident Commanders can pinpoint a failing service among 1000 microservices within seconds on their vast 80-inch monitoring consoles and direct resolution admirals to fix it in the next 15 minutes, it might just all work fine.