| Simple. Use the principles I said above, for distributed and asynchronous systems. The SQL server should have a way to do pubsub. It can then notify your PHP webhook via HTTP, or run a script on the command line, when a row changes. It should have an interface to subscribe to these changes. If your SQL data store lacks the ability to notify others that a row changed, then there’s your main problem. Add that functionality. Make a TRIGGER in SQL for instance and use an extension. If MySQL really lacks this ability then just put node.js middleware in front of it that will do this for you. And make sure that all mysql transactions from all clients go through that middleware. Now your combined MySQL+Node server is adequate to help clients avoid stale data. As I already said, if you for some reason refuse to handle push updates, then you have to store the latest row state (etags) in your PHP cache (apcu). And then when you access a bunch of cached items on a request, at the end collect all their ids and etags and simply bulk query node.js for any cached items that changed. You can use bloom filters or merkle trees or prolly trees to optimize the query. Joel Gustafson had a great article on this and now uses it in gossiplog: https://joelgustafson.com/posts/2023-05-04/merklizing-the-ke... |
What if an SQL update changes a million rows, and you had cached the sum of those million rows? Should it send a million notifications? Should it re-run the sum for you? What if it's a complex operation and it takes a while to re-compute, and another update arrives before the re-compute finishes?
And of course, you will always have some partitions, so you will occasionally need to re-query the data anyway and re-establish any kind of pubsub system. And the complexity in efficiently reconciling two views of a large amount of data is a major issue, that you are just brushing off as "you can do various things to maybe optimize the query".