|
Thanks! As for the nature of our queries, almost all of our use cases are group-by with count(*) plus some constraints. So we're interested, for example, in the number of views, grouped by a few dimensions, over a specific time interval (per month or quarter, typically), with some constraints, including time (last 12 months, often). Usually the user selects a whole bunch of dimensions, and we display this in the UI as a table where we pivot or nest based on the dimensions. For example, if you group by month, by region and customer, then you might get the months as horizontal columns, the regions as the vertical column, with totals for each region, then the customer nested under the region, with totals for each customer within the region. (The underlying query gives us a flat table, which we convert to a kind of dimensional hypertable structure for display.) Our application is schema-agnostic, which is why we use JSON. If we were to avoid JSON, the only realistic option would be for the app to use SQL to create tables, and handle schema migrations, and sort of control the schema. That would make it a somehwat different app. Of course, for many time windows, we're talking about tens or hundreds of millions of events. Elasticsearch is very fast at aggregating data and lets us do queries that span a few months in just milliseconds, whereas grouping an entire dataset containing years of data typically takes maybe 4-5 seconds, still fast enough to be acceptable for a reporting UI. In my experience, Postgres isn't as fast at counting. |