| Thanks for your feedback! > Something McKinley doesn't address is that it's quite advantageous if the values in your data warehouse don't change intra-day because this lets business users reach consensus. Whereas if Bob runs a report and gets $X, and Alice runs the same report 5 minutes later and gets $Y, that creates confusion (much more than you would expect). I recall a particular system I built that refreshed every 6 hours (limited by upstream), that eventually Marketing asked me to dial back to every 24 hours because they couldn't stand things changing in the middle of the day. If they want to see a consistent view of the report, you could bound this. 1/ SELECT * FROM FOO WHERE DATE_TRUNC('day', updated_at) < DATE_TRUNC('day', DATEADD(day, -1, CURRENT_DATE())); If your dataset doesn't contain kv, you can turn on include `artie_updated_at` which will provide an additional column with the updated_at field to support incremental ingestion. 2/ If you had stateful data, you could also explore creating a Snowflake task and leveraging the time travel f(x) to create a "snapshot" if your workload depended on it. 3/ Also, if you _did_ want this to be more lagged, you can actually increase the flushIntervalSeconds [1] to 6h, 24h, whichever time interval you fancy. You as the customer should have maximum flexibility when it comes to when to flush to DWH. 4/ You can also choose to refresh the analytical report on Looker / Mode to be daily. [2] > Now of course I see you're targeting more real-time use cases like fraud detection. That's great! But why you would run a fraud detection process out of your data warehouse, which likely doesn't even have a production-grade uptime SLA? Run it out of your production database, that's what it's for! You can certainly do this in production db (that was our original hypothesis as well!), however, after talking to more companies...it has become more obvious to us that folks that are running fraud algos actually want to join this across various data sets. Further, by using a DWH - it provides a nice visualization layer on top. Of course, you could go with something even more bespoke by utilizing real-time DBs such as Materialize / Rockset / RisingWave. Just comes with trade offs such as increase in architectural complexity. There are also plenty of additional use cases this can unlock given that DWH is a platform, any post-DWH application can benefit from less lag, such as reverse ETLs. [1] https://docs.artie.so/running-transfer/options [2] https://mode.com/help/articles/report-scheduling-and-sharing... |