Your appeal to think about it operationally helped me understand. I have to answer questions like the ones you pose whenever something odd pops up in the logs. Thank you.
As a concrete example: we work with enterprises of anywhere from 10K to 500K people, up to government scale. Each person may have a desktop, laptop, or phone, plus all the servers, printers, and switches those connect to and, at the logical layer, all the applications and services that make them useful. We'll see multiple central logging systems, hierarchies of administrators, and the results of mergers, acquisitions, and one-off or zombie projects. These organizations are getting sophisticated enough to log 10M, 1B, etc. alerts a day (e.g., using Graylog or Splunk), so we need to focus on the next step: being able to point to one alert and ask what's happening around it.
It's a really fascinating data problem, so we've been loving building tools for seeing into it!
Indeed it is. Ingesting 100k events per second into one or more centralised log management platforms will not be efficient if you're relying on a row-based analysis approach.
Imagine having to do that for 30k machines and 8-9k access points across 100+ different locations, all accessing hundreds of different systems. That approach does not work efficiently when you want the dependencies visualised automagically.
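To make the contrast with row-by-row analysis concrete, here is a minimal sketch of the "point to one alert and ask what's happening around it" idea: turn alert rows into a graph and walk outward a few hops. The row layout (alert ID, host, service), the sample values, and the function names `build_graph` and `neighborhood` are all made up for illustration; this is not how Graylog, Splunk, or any particular tool does it.

```python
from collections import defaultdict, deque

# Hypothetical alert rows: (alert_id, source_host, target_service).
events = [
    ("a1", "laptop-17", "vpn"),
    ("a2", "laptop-17", "mail"),
    ("a3", "printer-3", "mail"),
    ("a4", "laptop-99", "wiki"),
]

def build_graph(rows):
    """Link each alert to the entities it mentions (undirected)."""
    graph = defaultdict(set)
    for alert_id, host, service in rows:
        for entity in (host, service):
            graph[alert_id].add(entity)
            graph[entity].add(alert_id)
    return graph

def neighborhood(graph, start, max_hops):
    """Breadth-first walk: map each node within max_hops of start to its distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in graph[node]:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return seen

graph = build_graph(events)
around_a1 = neighborhood(graph, "a1", 2)
```

With these sample rows, two hops out from alert `a1` reaches its host, its service, and the other alert raised by the same host, while the unrelated `a4` is never touched no matter how far you walk. At real volumes you would want an index or graph store rather than an in-memory dict, but the query shape stays the same.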