by edmundsauto
1585 days ago
I’m building a hosted data warehouse for different verticals. My goal is to target people who are interested in doing analysis but find that acquiring data and setting up even Datasette is too complex, especially when the data needs transformations to be easier to comprehend. I’m then building that into a platform where people can fork any query, modify it, and publish it with their own analysis to build a portfolio. My first market is sports data. There are many aspiring analysts, and I want to 10x the number of people who do this work. I think the best way to learn analysis is SQL, and the best way to learn SQL is by building on other people’s queries (learn by example / exploration).
This is what OLAP-like engines are built for.
When you have these types of queries, the relational model ends up degenerating into a star schema: queries issue a join for each data column to build a first projection, then make a pass over that projection for aggregation, typically over a relatively recent time range.
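To make that query shape concrete, here is a rough sketch using Python's built-in `sqlite3` module. The table and column names (`fact_game_stats`, `dim_player`, `dim_team`) and the sample rows are made up for illustration, borrowing the sports example from the parent comment; they are not from any real dataset.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema: one fact table, two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_player (player_id INTEGER PRIMARY KEY, player_name TEXT);
        CREATE TABLE dim_team (team_id INTEGER PRIMARY KEY, team_name TEXT);
        CREATE TABLE fact_game_stats (
            player_id INTEGER REFERENCES dim_player(player_id),
            team_id INTEGER REFERENCES dim_team(team_id),
            game_date TEXT,
            points INTEGER
        );
    """)
    # Illustrative sample data only.
    conn.executemany("INSERT INTO dim_player VALUES (?, ?)",
                     [(1, "Ada"), (2, "Grace")])
    conn.executemany("INSERT INTO dim_team VALUES (?, ?)",
                     [(10, "Hawks"), (20, "Owls")])
    conn.executemany("INSERT INTO fact_game_stats VALUES (?, ?, ?, ?)", [
        (1, 10, "2024-01-05", 12),
        (1, 10, "2024-03-01", 20),
        (2, 20, "2024-03-02", 15),
        (1, 10, "2024-03-03", 8),
    ])


def recent_points_by_player(conn, since):
    """One join per dimension, then an aggregation over a recent time range."""
    return conn.execute("""
        SELECT p.player_name, t.team_name, SUM(f.points) AS total_points
        FROM fact_game_stats AS f
        JOIN dim_player AS p ON p.player_id = f.player_id
        JOIN dim_team AS t ON t.team_id = f.team_id
        WHERE f.game_date >= ?
        GROUP BY p.player_name, t.team_name
        ORDER BY total_points DESC
    """, (since,)).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = recent_points_by_player(conn, "2024-03-01")
print(rows)
```

Every extra attribute you want in the result adds another join against a dimension table, which is the pattern described above.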
For these, native columnar stores are usually a better option. Things like Apache Pinot https://pinot.apache.org/ might be a better fit.
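As a toy illustration of why a columnar layout suits this workload (this is only a sketch of the general idea, not how Pinot or any specific engine stores data, and the field names are invented): when the same records are stored column by column, an aggregation only has to scan the one column it needs.

```python
# Row-oriented layout: each record carries every field.
row_store = [
    {"player": "Ada", "team": "Hawks", "points": 12},
    {"player": "Ada", "team": "Hawks", "points": 20},
    {"player": "Grace", "team": "Owls", "points": 15},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


column_store = to_columns(row_store)

# Row layout: the aggregation walks every record and picks out one field.
total_from_rows = sum(row["points"] for row in row_store)

# Column layout: the aggregation reads a single contiguous list.
total_from_columns = sum(column_store["points"])

print(total_from_rows, total_from_columns)
```

Both totals are the same; the difference is how much unrelated data has to be read to get there, which matters once the table is wide and large.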
If you add a real-time requirement, it gets even more challenging and moves into the realm of custom-built query engines, such as those that back products built by Medallia or other customer experience companies.
It's a really interesting niche.