Hacker News
by Sesse__ 867 days ago
After a quick look, I'm not sure if I would call this “industrial strength”. In particular, the join optimizer (typically the heart of a large-scale SQL optimizer) looks very rudimentary? And the statistics it uses have zero idea about correlation, no histograms beyond min/max…

I was wondering about the same claim. However, I believe that JOINs are a common weakness among OLAP database engines, and DataFusion is built on top of a columnar in-memory format, Apache Arrow.
Being columnar, I guess you could say DataFusion has a good executor, but not a good optimizer.
Not that I was trying to make any of those claims; I was just trying to connect the domain (OLAP engines) with what appears to be a common problem in it.