|
|
|
|
|
by BeefWellington
606 days ago
|
|
I'll use a contrived example here to explain what the value of streaming the data itself is. Let's say you run a large installation that has a variety of very important gauges and sensors. Due to the size and complexity of this installation, these gauges and sensors need to be fed back to a console somewhere so that an overseer role of sorts can get that big picture view to ensure the installation is functioning fully healthy. For that scenario, if you look at your data in the sense of a typical RDBMS / Data Warehouse, you would probably want to save as much over the wire traffic as possible to ensure there's no delays in getting the sensor information fed into the system reliably on time. So you trim down things to just a station ID and some readings coming into your "fact" table (it could be more transactionally modeled but mostly it'll fit the same bill). Basically the streaming is useful so that in near-realtime you can live scroll the recordset as data comes in. Your SQL query becomes more of an infinite Cursor. Older ways of doing this did exist on SQL databases just fine; typically you'd have some kind of record marker, whether it was ROWID, DateTime, etc., and you'd just reissue an identical query to get the newer records. That introduces some overhead though, and the streaming approach kind of minimizes/eliminates that. |
|
What I don't understand is streaming joins. None of your gauge values need to join to anything.
And if they did -- if something needed to join ID values to display names, presumably those would sit in a database, not a different stream?