Hacker News new | ask | show | jobs
by CharlesW 1221 days ago
I'm curious what this enables can't be done with a thoughtfully-designed data warehouse. Looking at the "With MetricFlow" and "Without MetricFlow" examples, it seems like it's designed for people who are willing to use a new query language as long as it's not SQL?
3 comments

On the one hand, the compile targets for MetricFlow and DBT are SQL. So strictly speaking, nothing can be done with these that can't be done with a data warehouse.

SQL doesn't really have a way to define new aggregations well. So taking something like Daily Active Users / Monthly Active Users (DAU/MAU), which is a semi complicated calculation using windowed data, there's no good way to have a repeatable DAU/MAU aggregation concept you could reuse at different levels without having to re-write that sql that defines the aggregation. EG DAU/MAU against the account, or DAU/MAU by user type, or DAU/MAU by region would all be different queries/reports.

Instead if you have a layer that can be given the definitions of how a particular aggregation is structured, and it knows how data is related to each other, you can ask for that metric against different concepts or levels without having to repeat yourself.

Extending that out, if BI tools can understand that layer, then you can get consistent WYSIWYG report building across multiple tools without having to redefine the calculations everywhere.

I don't think I did a great job explaining here, this keynote is a good topic survey IMO: https://www.youtube.com/watch?v=w2weqVjbUTs&feature=emb_imp_...

Your answer was helpful, and I appreciate the video link as well. Thank you!
In practice - complicated time series metrics, especially on top of derived temporal-logic attributes like funnels, activation, etc., are phenomenally difficult to write by hand in SQL for most analysts. We are in the process of switching to dbt metrics and it cuts the effort down for this kind of thing 5x-10x, and the SQL code dbt generates runs signficantly faster too.
How much does the dbt enterprise cost?
I’ve seen reports of a few hundred dollars per seat per month. I’m sure it varies widely depending on how big the enterprise is.
Looking at the DBT semantic layer documentation I think I get it. The goal seems to be describing a metric that could be aggregated at multiple levels once instead of defining all of the aggregations separately.

For example, if you want to be able to aggregate revenue by subscription plan and by country over year, month or day, then you would have 6+ different aggregations. The more dimensions and time periods you add, the more the various combinations blow out. (I think the formula is dimension permutations multiplied by the number of time periods.) Having a single definition of the metrics prevents errors from occurring in rote implementations of the metrics. A bit like generics, but for queries.

At this point, I'm curious how they manage to make the integrations with different platforms work. (Ultimately, I think that's why they acquired Transform.)

An alternative to Transform that's done an awesome job to integrate with BI tools is Cube.js Check their website https://cube.dev
I think that’s still the outstanding question. Tools like Mode and Hex have already started supporting dbt’s metrics layer, but will big players like Looker and Tableau also move to support it? That’s harder to say
One of Transform's strong suits is their Tableau integration - they have really good tooling for pushing metrics out to Tableau, rather than relying on Tableau to pull them in. Apparently AirBnb was/is still a very dedicated Tableau shop.
Minerva actually had really poor tableau support as of a year ago, that is something Transform improved on. Most of Minerva consumption was done via Superset, GoogleSheets, and email