Small. We're dealing with financial accounts, holdings, and transactions. So a user might have 10 accounts, thousands of holdings, and tens of thousands of transactions, plus a handful of supplemental data tables. Then there is market data that is shared across tenants and updated on an interval. This data is maybe 10-20M rows.
Just to clarify, the data is prepared when the user (agent) analytics session starts. Right now that takes 5-10s, which means it's typically ready well before the agent has actually determined it needs to run any queries. I think for larger volumes, pg_duckdb would allow this to scale to tens of millions of rows pretty efficiently.
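To make the "prepare at session start, query later" idea concrete, here is a rough sketch of the pattern, not our actual code. Everything in it is hypothetical: the `AnalyticsSession` class, the `load_rows` callback, the `transactions` schema, and the use of an in-memory SQLite database as a stand-in for the real analytics store.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


class AnalyticsSession:
    """Hypothetical session wrapper that prepares tenant data in the background."""

    def __init__(self, tenant_id, load_rows, executor):
        self.tenant_id = tenant_id
        # Start preparation immediately so the agent is not blocked on it.
        self._ready = executor.submit(self._prepare, load_rows)

    def _prepare(self, load_rows):
        # check_same_thread=False because the connection is created on a
        # worker thread and later used from the caller's thread.
        conn = sqlite3.connect(":memory:", check_same_thread=False)
        conn.execute("CREATE TABLE transactions (account_id TEXT, amount REAL)")
        conn.executemany(
            "INSERT INTO transactions VALUES (?, ?)", load_rows(self.tenant_id)
        )
        conn.commit()
        return conn

    def query(self, sql, params=()):
        # Waits only if preparation has not finished by the first query.
        conn = self._ready.result()
        return conn.execute(sql, params).fetchall()


def fake_load_rows(tenant_id):
    # Assumed stand-in for fetching one tenant's rows from the primary database.
    return [("acct-1", 100.0), ("acct-1", -25.5), ("acct-2", 40.0)]


executor = ThreadPoolExecutor(max_workers=2)
session = AnalyticsSession("tenant-a", fake_load_rows, executor)
totals = session.query(
    "SELECT account_id, SUM(amount) FROM transactions "
    "GROUP BY account_id ORDER BY account_id"
)
executor.shutdown(wait=True)
```

The point of the future is that the first query only pays the preparation cost if the agent gets there before loading finishes, which with a 5-10s load it usually doesn't.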