| I think it would be helpful if you could dive deeper into why you think "Refreshing the data every five minutes in batches" is "sufficient". From my perspective, batching is more complicated than streaming: batching requires you to define parameters like batch size and interval, while streaming does not. Maybe batching tools are simpler than streaming tools, but I am not so sure. Batching in general also has higher latency, which is why I usually avoid it unless it brings a clear benefit. That said, batching has one advantage over streaming: it can amortise a cost that you pay only once per batch.
With streaming you would pay that cost for each item as it arrives. Further, the mindset required of engineers who work with batching is different from the one required for streaming. Each of these can be a valid concern when choosing between batching and streaming. However, I find it difficult to value statements like "batching is the default because the industry has been doing this for years". I think the industry as a whole benefits when engineers in these kinds of discussions spell out why certain conditions lead to a choice like batching. |
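To make the amortisation point concrete, here is a rough sketch of the trade-off. The cost model is an assumption, not a measurement: `fixed_cost_ms` stands in for whatever overhead is paid once per call (a connection, a transaction, a job startup), `per_item_ms` is the work per record, and both numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    # Hypothetical numbers, for illustration only.
    fixed_cost_ms: float  # overhead paid once per call (per item or per batch)
    per_item_ms: float    # work that scales with the number of items


def streaming_cost_ms(n_items: int, model: CostModel) -> float:
    """Every item pays the fixed overhead on arrival."""
    return n_items * (model.fixed_cost_ms + model.per_item_ms)


def batching_cost_ms(n_items: int, batch_size: int, model: CostModel) -> float:
    """The fixed overhead is paid once per batch, not once per item."""
    n_batches = -(-n_items // batch_size)  # ceiling division
    return n_batches * model.fixed_cost_ms + n_items * model.per_item_ms


model = CostModel(fixed_cost_ms=50.0, per_item_ms=1.0)
print(streaming_cost_ms(1000, model))      # 1000 * (50 + 1) = 51000.0
print(batching_cost_ms(1000, 100, model))  # 10 * 50 + 1000 * 1 = 1500.0
```

The saving only exists when the fixed cost is large relative to the per-item work, and it is bought with latency: an item that arrives just after a batch closes waits up to one full interval. Note that `batch_size` is exactly the kind of extra parameter the comment above says batching forces you to choose.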
Not OP, but I'm guessing it's because most of that data is not actionable in real time. There's zero point in getting real-time data to analysts or decision makers if they're not going to use it to make real-time decisions; arguably, it can even be counterproductive, leading to a kind of organizational ADHD where people fret over minute-to-minute changes when they should be focusing on daily or monthly running averages.