by sradman 2169 days ago

I like the methodology of the Join Order Benchmark (JOB). The key takeaway is PostgreSQL-specific:

> ...the most important statistic for join estimation in PostgreSQL is the number of distinct values. These statistics are estimated from a fixed-sized sample, and we have observed severe underestimates for large tables.
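
To make the quoted point concrete, here is a rough sketch of how a fixed-size sample can badly underestimate the number of distinct values on skewed data. Everything in it is an assumption for illustration: the table shape (1,000 hot values plus a long tail of unique ones) is made up, the 30,000-row sample is meant to resemble PostgreSQL's default ANALYZE sample, and the formula is the Haas-Stokes style estimator that I believe PostgreSQL uses, not a copy of its actual code.

```python
import random
from collections import Counter

# Hypothetical table: 900,000 rows spread over 1,000 hot values,
# plus 100,000 rows that each hold a value seen nowhere else.
TOTAL_ROWS = 1_000_000
HOT_ROWS = 900_000
HOT_VALUES = 1_000
SAMPLE_SIZE = 30_000  # assumed sample size, fixed regardless of table size


def value_at(row: int) -> int:
    """Return the column value stored in a given row of the made-up table."""
    if row < HOT_ROWS:
        return row % HOT_VALUES              # hot values 0..999, 900 rows each
    return HOT_VALUES + (row - HOT_ROWS)     # long tail: one row per value


def estimate_distinct(sample: list[int], total_rows: int) -> float:
    """Haas-Stokes style estimate: n*d / (n - f1 + f1*n/N)."""
    n = len(sample)
    counts = Counter(sample)
    d = len(counts)                                   # distinct values in sample
    f1 = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
    return n * d / (n - f1 + f1 * n / total_rows)


def run_demo(seed: int = 42) -> tuple[int, float]:
    """Sample the table once and return (true distinct count, estimate)."""
    rng = random.Random(seed)
    rows = rng.sample(range(TOTAL_ROWS), SAMPLE_SIZE)
    sample = [value_at(r) for r in rows]
    true_distinct = HOT_VALUES + (TOTAL_ROWS - HOT_ROWS)
    return true_distinct, estimate_distinct(sample, TOTAL_ROWS)


if __name__ == "__main__":
    true_distinct, estimate = run_demo()
    print(f"true distinct values: {true_distinct}")
    print(f"estimated from sample: {estimate:.0f}")
```

The sample sees every hot value many times and only a small fraction of the tail, so the estimate ends up far below the true count. On uniformly distributed data the same formula does much better, which is why the problem shows up mainly on large, skewed tables.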

Live statistics, incrementally updated on DML execution, are a key feature of a good query optimizer. As a zero-administration RDBMS, SQL Anywhere had gained a reputation for a best-of-breed query optimizer [1] a decade ago; I'm curious whether this still holds true.

In the last decade, the importance of OLAP queries in row stores has diminished due to the superiority of column stores. I'd be interested in a comparison of the Citus query optimizer vs., say, Presto.

[1] https://www.student.cs.uwaterloo.ca/~cs448/W11/cs448_Paulley...