by belluchan 4535 days ago
And software developers, don't forget to implement 4-byte characters too, please. Dealing with MySQL is an utter nightmare. I believe 4-byte characters still break even GitHub comments.

MySQL was a Norwegian company at the beginning, though you wouldn't believe it. It's one of the worst products when it comes to I18N, and especially Unicode. They still kind of store Unicode text as a chain of encoded UTF-8 characters, if I'm not wrong. Their stupid command line utility still defaults to LATIN1 for input and output despite my locale clearly saying .UTF-8. People in their IRC channel still refuse to admit that any of this is a problem.
The horrible unicode support in MySQL fits well with its general careless attitude toward data integrity.

Minor correction: MySQL was Swedish and InnoDB Finnish.

About 2 weeks ago I tried to file a bug with our backend guys about 4-byte characters wreaking havoc on our API.

My example broke the bug tracker's (Bugzilla) comment system as well. I chuckled.

Yeah. We've noticed, to our own amusement, that Jira (we're on an older version) can't handle non-ASCII. Makes entering tickets involving other languages fun.
I assume Bugzilla broke due to being backed by MySQL. Bugzilla itself is written in Perl so should have no problem.
At my last job I checked in a test case for our astral character handling. And broke the build server.
I ran into that issue with MySQL's UTF-8 handling. It was 𝒜wesome: http://geoff.greer.fm/2012/08/12/character-encoding-bugs-are...
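
For anyone wondering what "4-byte character" means in this thread: code points above U+FFFF (the astral planes) take 4 bytes in UTF-8, while MySQL's legacy utf8 charset stores at most 3 bytes per character (utf8mb4 is the one that handles 4). A rough Python sketch for spotting such characters in a string before it reaches the database; the helper names utf8_len and astral_chars are made up for illustration, not from any library:

```python
def utf8_len(ch):
    """Number of bytes a single code point occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def astral_chars(text):
    """Code points above U+FFFF, i.e. the ones that need 4 bytes in UTF-8."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# U+1D49C is MATHEMATICAL SCRIPT CAPITAL A, an astral-plane character.
sample = "It was \U0001D49Cwesome"

for ch in ("A", "\u00e9", "\u20ac", "\U0001D49C"):
    print(f"U+{ord(ch):04X} takes {utf8_len(ch)} byte(s)")

print(astral_chars(sample))
```

Something like this in a test suite would at least flag the input that a 3-byte column would choke on, though what the database actually does with it depends on the server version and SQL mode.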