Hacker News new | ask | show | jobs
by ttfkam 1015 days ago
That isn't my experience at all. MySQL wins at serving a large number of very simple SELECT queries and plain bulk INSERTs.

But Postgres wins hands down once the queries get slightly more complex, for larger numbers of concurrent UPDATEs, and kicks the pants off MySQL with a RETURNING clause where you don't have to perform a followup SELECT after each write, especially to get the new ID. (Necessary for writable CTEs and explains why MySQL doesn't support them.)

And don't get me started on MySQL's lack of range types and their associated indexes. Exclusion constraints can be a godsend to data validity.

MySQL has improved greatly since 8.x, but it's still very far behind the capabilities of Postgres, MS SQL Server, DB2, and Oracle.

MySQL does simple things fast though not as fast as SQLite. It's getting squeezed on the high end by all the other big SQL vendors and squeezed on the low end by SQLite. It lives in a functionality gap that keeps getting narrower and narrower.

2 comments

Overly basic benchmarks aside it’s the slightly more complex queries or starting in the hundreds of thousands or millions of records at the very start of the data was night and day.

Learning to do things well in Postgres while being fairly competent in MySQL for the better part of 15 years was still a gap.

When I saw the ability or availability of extensions - part of me did wish I spent more time with Postgres instead of MySQL/MS SQL/Oracle, etc alone.

Can’t speak to your experience or schema, but IME MySQL does just fine at high (100+K QPS) mixed workload, with complicated queries. It does require more tuning than Postgres, but OTOH there’s more to monitor to determine exactly what is bottlenecking. That’s certainly not to say Postgres can’t also handle volume, but its MVCC design and O2N tuple ordering doesn’t lend itself as easily to high write workloads.

No returning clause, you’re correct - no way around that if it matters for your use case.

Similarly, yes, functional indices are quite nice if you know how and when to use them.

I’d love to see benchmarks comparing the two RDBMS, properly tuned, with the same workload. I’m OOO this week but I might do that in the future to see for myself.

As to SQLite, I get the appeal, and I get why it maintains backwards-compatibility so fiercely, but good lord some of its quirks are bad. FKs don’t do anything by default, PKs can have NULLs, column types are mere suggestions unless you enable strict mode…

> Similarly, yes, functional indices are quite nice if you know how and when to use them.

In fairness to MySQL, you can simulate this surprisingly well by adding indexes to computed columns.

> I’d love to see benchmarks comparing the two RDBMS, properly tuned, with the same workload. I’m OOO this week but I might do that in the future to see for myself.

This is where I see most existing database benchmarks falling down.

tl;dr: testing database performance across engines is a lot harder than most folks realize.

Imagine you have a DB where you're tracking classroom assignments for all public schools in the state. Here are some requirements:

* A classroom must have at least one teacher and at least one student per class scheduled.

* A classroom must not allow more students than its capacity.

* No teacher or student may be assigned to more than one classroom at the same day/time.

To solve this in MySQL, you MUST use application code to check for conflicts and will always be subject to race conditions at your data layer.

The check for existing slots must always be a separate query from the insert, and MySQL cannot restrict overlapping timestamp ranges at the data layer. This requires data custodial work to resolve.

Postgres on the other hand both supports a timestamp range type but also allows exclusion constraints to prevent the same person from being assigned to more than one class when those ranges overlap—a data layer restriction that makes app support code unnecessary and race conditions logically impossible.

Testing "the same workload" is difficult because the data schemas do not match, the application code calling it will not match, and if you did implement it, folks would cry about "apples and oranges" and how they're not the same workload.

In MSSQL, you might write a .NET component that lives in the database and handles the potential race condition. You also might use temporal tables in that engine to track changes over time, something that (again) would involve non-trivial code changes for MySQL and Postgres to match requirements.