Hacker News new | ask | show | jobs
by zadikian 115 days ago
Seems like a frequent surprise is that Postgres and MySQL don't default to serializable (so not fully I in ACID). They do read-committed. I didn't see this article mention that, but maybe I missed it. The article says read-committed provides "slightly" better performance, but it's been way faster in my experience. Forget where, but I think they said they chose this default for that reason.

Using read-committed ofc means having to keep locking details in mind. Like, UNIQUE doesn't just guard against bad data entry, it can also be necessary for avoiding race conditions. But now that I know, I'd rather do that than take the serializable performance hit, and also have to retry xacts and deal with the other caveats at the bottom of https://www.postgresql.org/docs/current/transaction-iso.html

3 comments

Recent versions of MySQL and MariaDB default to repeatable-read for InnoDB tables, not read-commited :

https://dev.mysql.com/doc/refman/8.4/en/set-transaction.html...

https://mariadb.com/docs/server/reference/sql-statements/adm...

I don't know about MyISAM though (who uses it anyway ;-) ).

The issue with SERIALIZABLE, aside from performance, is that transactions can fail due to conflicts/deadlocks/timeouts, so application code must be prepared to recognize those cases and have a strategy to retry the transactions.
Right. So my code had a helper to run some inner func in a serializable xact, in rw or ro mode, which would retry with backoff. Like the TransactionRunner in Spanner. But even with no retries occurring, it was very slow.
VoltDB took this to an extreme - the way you interact with it is by sending it some code which does a mix of queries and logic, and it automatically retries the code as many times as necessary if there's a conflict. Because it all happens inside the DBMS, it's transparent and fast. I thought that was really clever.

I'm using the past tense here, but VoltDB is still going. I don't think it's as well-known as it deserves to be.

Interesting. How is that faster than just having the code running on the same machine as the DB? Guess it could be smarter about conflicts than random backoff.
> Postgres and MySQL don't default to serializable

Oracle and SQL Server also default to read committed, not serializable. Serializable looks good in text books but is rarely used in practice.

One reason Oracle uses it is because this mode scales horizontally whilst allowing very large transactions. You can just keep adding write masters.

The best implementation of serializable transactions I've seen is in FoundationDB but it comes with serious costs. Transactions are limited in size and duration to a point where many normal database operations are disallowed by the system and require app-layer workarounds (at which point, of course, you lose serializability). And in many cases you do need cluster locks for other purposes anyway.

Spanner has similar limitations on xact size, maybe for this reason?
Probably. I've seen it argued that TX size limits are a good practice anyway, and not having them is a design fault of SQL, but it's an argument on thin ice. Transaction size and scope is usually defined by the nature of the business logic, it's not something you can just define to be whatever you want without consequence. An RDBMS can do atomic and correct changes to an entire very large table without any developer effort. That might hang writes for a few minutes so depending on the nature of your application that might not be a feature you can get away with using, but if the table in question is updated by background workers and not on a latency sensitive path it can be a perfectly viable thing to do (on a good database engine, so not postgres mvcc).
I'll keep my xacts small until that one time we have to do some big manual fix or migration. But you don't even have to do anything that wild to hit the Spanner 100MB limit.
And 100MB is huge! FoundationDB limits transactions to 10MB.
Yeah, the only examples I know of it being default are Spanner and Cockroach, which are for a different use case.