Hacker News new | ask | show | jobs
by oroul 1660 days ago
Does it matter that dbt is written in Python? dbt models are still SQL at heart. Sure there's the addition of Jinja to enable references for data lineage and configuration, but it's all compiled to SQL with fine-grained control over what's produced.

Forgive me if I come across as combative, but I don't understand generic appeals to things like a language being rigorous. Rigorous to what end? What problem is it solving where that is required? If you know something specific in this domain where that level of rigor is needed, why not share what it is?

There are a lot of problems in the analytics space (and a lot of opportunity for new practices, tools, and businesses), but I would argue that at the end of the day the primary issue is whether or not data producers choose to model data such that it is legible outside of the system that produced it much more than it is about any particular language or methodology.

1 comments

For typical datasets (~95% of companies have medium data on the order of gigabyte), you are 100% correct that the data modeling / formatting is the biggest challenge.

Having well modeled data that matches the business domain is a massive (2-10x) productivity boost for most business analysis.