| Stas: "The issue is that majority of the companies don't experience a 70% revenue growth to catch up with the growth in costs" I think you are misunderstanding something very fundamental here. Snowflake has usage pricing and no one is forcing companies to use Snowflake 70% more every year. In my experience, companies are typically evaluating spend on other platforms and, after some testing, moving additional workloads there to displace cost elsewhere. Let's say your Snowflake bill was $100k and you were unhappy with your security data lake provider and replaced a $1M bill there with $200k of Snowflake. Your Snowflake bill has now increased 200% to $300k, but you are still $800k ahead overall. In other words, your existing workload (the original $100k) didn't get more expensive. I've worked in data warehousing for a lot of years now and stepping back, I guess I don't understand what you are trying to accomplish here. I certainly think everyone should take a "trust but verify" approach with their vendors but honestly, I don't think you've proven your case, especially since you appear to completely ignore the competitive reality these vendors live in. Beyond that, I don't think "speeds and feeds" are the most important improvements going on with these platforms at the moment. Check the monthly release notes: BigQuery: https://cloud.google.com/bigquery/docs/release-notes
Databricks: https://docs.databricks.com/release-notes/product/index.html
Snowflake: https://docs.snowflake.com/en/release-notes.html Performance is important but it doesn't exist in a vacuum. What percentage of features in the past two months for each of these platforms relate to performance? On the flip side, how much does your company spend on things like data governance? How much would a data breach cost? How many people maintain the platform? What do pipeline failures cost? How is connectivity to other solutions your company uses? If you look at where innovation is happening (and this is a VERY interesting space these days), the bulk of improvements are in areas arguably more important to companies. BigQuery has added migration improvements, Databricks has added Photon and Unity Catalog improvements, Snowflake has added Java and Python stored procedures. The list is miles long for all of these vendors and I challenge anyone in the space to keep up with everything. Another comment here said all of these vendors are within 10-20% performance of each other. If that is true, in my opinion you're focused on a problem that is an edge case at best. Something to watch, but not nearly as interesting or as impactful as the rapid pace of innovation across this space in all areas. IMHO. |
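For anyone who wants to check the consolidation arithmetic in the comment above, here is a minimal sketch in Python. The function name and parameters are made up for illustration; the dollar figures are the ones from the comment, and nothing here reflects actual Snowflake pricing.

```python
def consolidation_summary(existing_bill, replaced_bill, added_spend):
    """Hypothetical helper: compare platform bill growth against overall savings.

    existing_bill: current spend on the platform you consolidate onto
    replaced_bill: spend on the provider being replaced
    added_spend:   extra platform spend needed to absorb the moved workload
    """
    new_bill = existing_bill + added_spend
    # Growth of the platform bill, as a percentage of the original bill
    growth_pct = (new_bill - existing_bill) / existing_bill * 100
    # Overall savings: what you stopped paying minus what you added
    net_savings = replaced_bill - added_spend
    return new_bill, growth_pct, net_savings


# Figures from the comment: $100k existing bill, $1M replaced, $200k added
new_bill, growth_pct, net_savings = consolidation_summary(100_000, 1_000_000, 200_000)
print(f"new bill: ${new_bill:,}, growth: {growth_pct:.0f}%, net savings: ${net_savings:,}")
```

The point of separating `growth_pct` from `net_savings` is that the first can look alarming on its own while the second shows total spend going down.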
Fair point, some of that net revenue increase is because of consolidation of workloads, although the majority of the cost is likely still driven by consumers expanding usage beyond what they expected. As I mention in my article, the second part of the increase in costs has to do with data governance, and my argument is that Snowflake doesn't make governance easy. Why can't they stand up an IAM-like service with a nice UI and dashboards? Why can't they make integrations with PagerDuty, Slack, and email work out of the box? Why can't I specify team-based budgets and instead have to do it on a per warehouse-team basis? Why do I have to build custom bespoke tooling on top to make governance work?
I can unequivocally say that at a certain scale you need to move on, and that Snowflake and many of the SaaS providers are too expensive even for medium-scale companies. This article describes this paradox better than I could: https://a16z.com/2021/05/27/cost-of-cloud-paradox-market-cap...
Moreover, Snowflake's enterprise pricing model scales even worse. Why do companies often have to pay twice the price per credit relative to the standard model? Shouldn't guarantees on security or support come with a fixed cost? Shouldn't enterprise offer economies of scale in pricing?
I also wish folks would read my article from end to end, because my conclusion in the article is that you don't really have a choice but to use an enterprise solution when your scale is small. If I had to start my own company and had only 2 data engineers, you betcha I would use Snowflake and Databricks.
--- btw, it really surprises me that nobody has commented on the workload manager. Am I the only one seeing that as an issue? I have enough exposure to compare it with Redshift and I can say that Snowflake's workload manager is just very bad at optimizing throughput.