by otter-in-a-suit 634 days ago
I’m the author, but posting as a private individual here, these being just my opinions and all that… but I can shed some more light on why I did move us to Superset.

Preset is great, as are most of these tools’ hosted versions! Lots of great folks working on these.

But, tbh, as an infrastructure company this is somewhat the core business of ngrok - hosting another DB + K8s service is something we have great tooling and lots of expertise for. And using ngrok makes it even easier.

The whole dogfooding aspect is important too - if I don’t run an app in production with ngrok I have a hard time empathizing with customers who want to do the same. My previous job encouraged that too and I’ve always liked that.

Also, yes, lots of moving parts - but most of them are very reusable and they share a lot of code, infra, and logic/operations playbooks etc. Costs are manageable - Athena charges $5/TB scanned iirc, which tends to be the biggest factor.
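To put a rough number on the Athena point, here is a back-of-the-envelope sketch. It uses only the $5/TB scanned figure recalled above; the function names, the binary-terabyte convention, and the sample workload are assumptions for illustration, not real usage data, and any per-query minimums or rounding are ignored.

```python
# Rough Athena scan-cost estimate. The $5/TB rate is the figure recalled in
# the comment above; everything else here is a hypothetical illustration.

BYTES_PER_TB = 1024 ** 4  # assumption: binary terabytes


def athena_scan_cost(bytes_scanned: int, usd_per_tb: float = 5.0) -> float:
    """Cost in USD for scanning `bytes_scanned` bytes at a flat per-TB rate."""
    return bytes_scanned / BYTES_PER_TB * usd_per_tb


def monthly_cost(queries_per_day: int, avg_bytes_per_query: int,
                 days: int = 30, usd_per_tb: float = 5.0) -> float:
    """Monthly estimate for a steady workload of similar-sized queries."""
    total_bytes = queries_per_day * avg_bytes_per_query * days
    return athena_scan_cost(total_bytes, usd_per_tb)


if __name__ == "__main__":
    # Hypothetical workload: 200 queries a day, each scanning about 2 GiB.
    estimate = monthly_cost(200, 2 * 1024 ** 3)
    print(f"~${estimate:.2f} per month")
```

Since the bill scales with bytes scanned rather than query count, partitioning and columnar formats that cut the scanned volume are the main lever on cost.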

2 comments

Appreciate you taking the time to reply :)

I guess the underlying cynicism in my tone speaks to the question that I didn't directly ask - how often does each of the components/moving parts fail and require manual intervention/fixing?

I often get pulled into complex distributed systems, and the team responsible for that flow (data or not) often has no idea where to begin.

Edit* On the point of Athena, I desperately wanted to use it but found BigQuery to be much better in every way you could think of. It's the black sheep in the company, as every other cloud thing we have is AWS. But honestly, nothing I've found in the AWS ecosystem comes close to BigQuery.

BigQuery + Metabase is such a powerful combination. Easy, affordable, effective.

I appreciate the time you took to write this all out (both the article and your response here). In particular, this line from the article resonated with my own experience over the last couple of decades:

> This particular setup—viewing DE as a very technical, general-purpose distributed system SWE discipline and making the people who know best what real-world scenarios they want to model—makes our setup work in practice.

The common analyst-to-DE path has some benefits for sure with respect to business-centric data modeling, but without the deep technical infrastructure investments and related support, the stack becomes a beast to deal with at scale (or just ends up being a massive cost on the balance sheet from outside vendor sourcing). You really need both verticals in order to be optimal IMO.

Of course if internally an org doesn't already have the platform/infrastructure to dogfood in the first place, this admittedly makes the proposition a bit more of a gamble.