by morgo 2540 days ago
Hi! Former product manager for MySQL here. The defaults have changed a lot across major releases:

https://mysqlserverteam.com/new-defaults-in-mysql-8-0/

https://dev.mysql.com/doc/refman/5.7/en/added-deprecated-rem...

https://dev.mysql.com/doc/refman/5.6/en/server-default-chang...

One detail that is not always obvious is how much work goes into limiting regressions. The work to switch to utf8mb4 really started in MySQL 5.6, by no longer allocating the sort buffer in full (and this was further improved in 5.7). 8.0 then added a new TempTable storage engine for variable-length temp tables.

These are not small cases either: compared to latin1, the _profile_ of queries could change from all in memory to on disk, so we could be talking about 10x regressions. In MySQL 8.0 it is more like 11%: https://www.percona.com/blog/2019/02/27/charset-and-collatio...
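
To make the mechanism concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a fixed-width in-memory temp table that reserves the maximum bytes per character for every VARCHAR column; the two VARCHAR(255) columns and the 16 MiB memory cap are hypothetical values chosen for illustration, not measurements.

```python
# Maximum bytes per character for each MySQL character set.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8mb3": 3, "utf8mb4": 4}


def fixed_row_bytes(varchar_lengths, charset):
    """Worst-case row width when every VARCHAR is padded to its maximum size."""
    return sum(n * MAX_BYTES_PER_CHAR[charset] for n in varchar_lengths)


def rows_that_fit(varchar_lengths, charset, memory_cap_bytes):
    """How many fixed-width rows fit under the (assumed) memory cap."""
    return memory_cap_bytes // fixed_row_bytes(varchar_lengths, charset)


# Hypothetical table: two VARCHAR(255) columns, 16 MiB cap (both assumptions).
columns = [255, 255]
cap = 16 * 1024 * 1024

for charset in MAX_BYTES_PER_CHAR:
    print(charset, fixed_row_bytes(columns, charset), rows_that_fit(columns, charset, cap))
```

Under these assumptions the same memory budget holds about a quarter as many utf8mb4 rows as latin1 rows, so a query that used to stay in memory can cross the threshold and spill to disk. Variable-length storage avoids paying for the worst case on every row.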

Edit: Also forgot to mention, switching the default character set broke over 600 tests. It's not as easy as it sounds!

1 comment

While I appreciate that it's the default now (utf8mb4)... If someone specified (by mistake) "utf8" as the character set, is that real UTF-8 or some other implementation currently?
If someone uses `utf8` in MySQL 8.0, they still get the 3-byte encoding (`utf8` is an alias for `utf8mb3`), along with a warning suggesting they should use `utf8mb4`, because `utf8` will be deprecated.

Redefining `utf8` to mean 4-byte would break the upgrade since existing tables would not be able to join against newly created tables.

This is discussed here: https://mysqlserverteam.com/sushi-beer-an-introduction-of-ut...
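
For anyone unsure whether their data is affected by the 3-byte limit, a minimal Python sketch follows. It only applies the UTF-8 encoding-length rule (characters outside the Basic Multilingual Plane need 4 bytes); it is not MySQL's own validation logic, and the helper names are made up for illustration.

```python
def fits_in_utf8mb3(text):
    """True if every character encodes to at most 3 bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


def chars_needing_utf8mb4(text):
    """Return the characters that need 4 bytes in UTF-8."""
    return [ch for ch in text if len(ch.encode("utf-8")) == 4]


# Plain ASCII and CJK text fit in 3 bytes per character; emoji do not.
for sample in ["beer", "寿司", "🍣🍺"]:
    print(repr(sample), fits_in_utf8mb3(sample), chars_needing_utf8mb4(sample))
```

Strings that fail this check are the ones that cannot be stored in a `utf8` (3-byte) column and need `utf8mb4`.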