| Interesting article. Some of it accurate. Some not.
>"Snowflake has no incentive to push a code change that makes things 20% faster because that can correspond to 10–20% drop in short-term revenue"
Completely untrue. There is constant optimization of the scheduler, execution process, global services, and compute fabric. The famous "we shipped AWS Graviton and it's like 10% cheaper" change was something we did to ourselves. There is also work underway to make FoundationDB faster and more efficient, and it's totally out of this world. In short, nobody wants to burn extra CPU cycles and bill you for it.
>"Disclose Hardware Specs"
This isn't hard to find if you work with Snowflake's SEs and Services, but it's not going to give you anything. The whole POINT of Snowflake is to hide all this nonsense and make it "just work". If you want CPU and SSD metrics, feel free to use Databricks (many do) or whatever. Now, there IS something to be said for some sort of observability into query execution while the query is running. There are constant discussions about that, and some of the upcoming features (like programmatic access to the query profiler) can open that up. But yeah, Snowflake is NOT something that will open up what's under the hood, and that is super intentional.
>"Not adopting benchmarks"
This comes around every so often and everyone freaks out. Just profile your own workloads. Whatever. Nobody cares about benchmarks.
>"Optimizer gremlins"
Snowflake COULD do more to expose some of the internals. My job (and the job of hundreds of my Services and technical SE colleagues) is to help customers understand what's happening under the hood. The company's "make it simple" ethos COULD be a bit more open. However, many of the common issues (micro-partition pruning, for example) can be solved by simple user education. I've lost count of how many customers I worked with who had zero Snowflake education, and even a 20-30 minute intro made them open their eyes and go "woah, I get it now". On the other hand, dozens of people told me that it was amazingly easy to use without training, and it IS!
>"Improve the workload manager to increase throughput"
The workload manager is considerably more complex and sophisticated than this guy tells us it is. I saw an internal presentation on its internals and asked for it to be converted into a Confluence article, which thankfully happened pretty quickly, and lots of people benefited. There is cost-based scheduling that uses the expected resources of queries to schedule them and also considers the actual resources consumed, all very frequently and for every XP. I wish that article were public, but I don't think it will be. Still, it's definitely not FIFO.
>"Not providing observability to monitor and reduce costs"
This is valid feedback right now, and it's what we constantly work on in Services. New manageability features are coming to help with this. See Capital One or the bunch of other companies in this ecosystem.
>"What companies that use Snowflake could do better?"
I agree with the point about education. A huge portion of the people using and abusing Snowflake don't have any formal education in it. The best thing you can do is hire Snowflake PS, get a partner/SI, or just take a damn class; they are REALLY good. Source: 2 years in Services at Snowflake with a focus on perf, cost, and manageability. |