by mbreese
1398 days ago
You could also store timestamps as part of the records. Then your query will always be consistent if you add an extra clause of tstamp < query_tstamp. So long as you store both the query_tstamp and the last record returned, you’ll get consistent results without needing to store individual cursors/snapshots for each user. (You’ll still spend more CPU time per query, but that’s kind of a given here.) You probably also need to switch from deleting records to adding an archive bit (or timestamp). This gets complicated fast.
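A rough sketch of what I mean, using SQLite from Python. The table layout and the names records, created_at, archived_at and fetch_page are made up for illustration, and it assumes created_at is assigned when the row is inserted. The client only has to hold on to query_tstamp and the id of the last record it saw:

```python
import sqlite3

def fetch_page(conn, query_tstamp, last_id=0, page_size=2):
    """Return one page of (id, body) rows as they existed at query_tstamp."""
    return conn.execute(
        """
        SELECT id, body FROM records
        WHERE created_at < :q                                -- ignore rows added after the first query
          AND (archived_at IS NULL OR archived_at >= :q)     -- ignore archiving done after the first query
          AND id > :last_id                                  -- resume after the last record seen
        ORDER BY id
        LIMIT :n
        """,
        {"q": query_tstamp, "last_id": last_id, "n": page_size},
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, body TEXT, "
    "created_at INTEGER, archived_at INTEGER)"
)
conn.executemany(
    "INSERT INTO records (id, body, created_at, archived_at) VALUES (?, ?, ?, ?)",
    [(1, "a", 10, None), (2, "b", 20, None), (3, "c", 30, None)],
)

query_tstamp = 100  # fixed when the user runs the first query
page1 = fetch_page(conn, query_tstamp)

# Changes made between page loads: a new record, and an "archive" instead of a delete.
conn.execute("INSERT INTO records VALUES (4, 'd', 150, NULL)")
conn.execute("UPDATE records SET archived_at = 160 WHERE id = 3")

page2 = fetch_page(conn, query_tstamp, last_id=page1[-1][0])
```

Record 3 still shows up on the second page and record 4 doesn't, because both changes happened after query_tstamp. That only works because the archive is a timestamp rather than a real delete.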
We struggled with this at a prior job because records within the most recent ~10 minutes could be delayed for any number of reasons. We'd be collecting data from thousands of IoT devices, and any of them could have momentarily lost its network connection for a few minutes, been rebooted, etc. So it'd be totally normal for end users to see records from 30 seconds ago, and then a few minutes later see older records appear in between the ones they had already seen, because some of them were slightly delayed in getting sent and processed.
We were leaning toward just putting a bound on the range that the end user could see (for example, only show records that are at least ten minutes old) to reduce the variability, but that gave the appearance that our system took ten minutes to process all data. It was a tricky one to solve from a user expectation perspective.