Hacker News new | ask | show | jobs
by pavel_tiunov 2741 days ago
Thanks for the question! We should do a better job describing this. In short: 1. Generates analytic SQL queries based on Cube.js schema. It can be simple ones like calculating page views or more advanced like calculating session metrics, attribution models or funnels. 2. Caches sql responses to not to overwhelm SQL backend with user requests. 3. Pre-aggregates data to be able to query trillions of data points in matter of seconds. 4. Orchestrates SQL query execution. Organizes dependencies between pre-aggregations, queue priorities, cache refreshes. 5. Provides REST analytic API for end users.
1 comments

Why do you need to select all rows in your Cubejs section when you can directly run query in athena and get back the aggregates you need.

Basically you select all rows then cubjs does something on those rows when you can infact directly run queries in Athena

Am i missing something?

It actually works exactly as you describe. We generate SQL query to return aggregates based on SQL supplied in Cube.js schema. We never fetch raw data from SQL backend. Architecture overview can probably help to understand: https://github.com/statsbotco/cubejs-client#architecture