| I don't really understand the heavy emphasis on "real-time" here. I mean it's log/event aggregation for ops insight. Unless the whole system is some tightly coupled feedback loop into an unsupervised machine learning model where the whole thing has actual hard-real-time requirements (something which might well be impossible to build), then there's no possible way that having a second or two delay, between a message being created and when you can actually see it, can possibly matter. I mean you don't have someone with instantaneous reflexes and resolution ability sitting there 24/7 with their eyes peeled as a stream of thousands of log messages flies by. The whole premise seems spurious. Is it really necessary for every startup and their uncle to delude themselves into thinking that their use case is "mission critical" or "carrier grade" or whatever? Auth0 provides basically "login as a service". It's not like they're managing the access control to nuclear launch codes or something. Unless some medical device manufacturer was stupid enough to make a critical surgical assistance device require an internet connection, a WAN round-trip on unreliable networks, and reliance on a third-party service in order to start operating it... how can this service being down possibly be anything more than an annoyance? What's the worst possible scenario? A session has to be rebuilt? A user has to make an extra login attempt? By their own admission the service has gone down already due to the old system architecture. How many babies died? |
Also, the emphasis is not just on the real-time aspect. The article mentions the issues with Kafka for HA.