by jcampbell1 4541 days ago
I agree with most of your bashing. PHP is a fucking mess, and non-OO strings are hangover vomit from C.

That being said, when there is a problem, there is usually a solution. PCRE actually works. JavaScript, for instance, has no collation support at all as far as I can tell.

In my experience, most of the problems with UTF-8 are mixed-content issues: data in one encoding being handled as if it were another.
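To make the mixed-content point concrete, here is a rough sketch in Python (picked only because it is easy to run, not because Python gets this right). `latin1_to_utf8` is a made-up helper that imitates what PHP's utf8_encode is documented to do, namely treat the input as ISO-8859-1 and convert it to UTF-8. The sample string is my own.

```python
# Hypothetical helper imitating PHP's utf8_encode: every input byte is
# assumed to be an ISO-8859-1 character and is re-encoded as UTF-8.
def latin1_to_utf8(raw: bytes) -> bytes:
    return raw.decode("latin-1").encode("utf-8")

latin1_bytes = "café".encode("latin-1")  # what a legacy Latin-1 source hands you
utf8_bytes = "café".encode("utf-8")      # what a UTF-8 source hands you

# Correct use: the input really was Latin-1.
once = latin1_to_utf8(latin1_bytes)

# Mixed content: the input was already UTF-8, so it gets encoded twice.
twice = latin1_to_utf8(utf8_bytes)

print(once.decode("utf-8"))   # café
print(twice.decode("utf-8"))  # cafÃ©
```

The conversion itself is not buggy. It just has no way of knowing what it was handed, which is why the environment matters more than the function.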

The best fix is not the Python route, but rather just deprecating a bunch of stuff such as utf8_encode/decode. Throw a warning when any database connection is not UTF-8. Throw a warning when the OS is not set up to return UTF-8. It is more important that people run PHP in an end-to-end UTF-8 environment than that the internals change. Once people have a good environment, they will stop talking about strlen/strpos, which are really not much of a problem. Maybe they should be renamed bytelen/bytepos, but PHP has too many problems of that type to count.

99% of the UTF-8 problems don't exist if everything is UTF-8. Counting Unicode code points vs bytes is not the real problem. The real problem is bullshit like 'SET NAMES utf8' / setlocale(LC_ALL, 'en_US.utf-8')
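A quick sketch of why byte-oriented length and search are mostly harmless once everything is UTF-8. Again Python, with a sample string of my own; `len` and `find` on the `bytes` object stand in for byte-based functions like strlen/strpos, which is my analogy and not an exact equivalence.

```python
text = "naïve café"
data = text.encode("utf-8")

code_points = len(text)  # counts code points
byte_len = len(data)     # counts bytes, as a byte-oriented length function would

needle = "café".encode("utf-8")
byte_pos = data.find(needle)   # byte offset, as a byte-oriented search would return
char_pos = text.find("café")   # code point offset

# The two offsets differ, but the byte offset is consistent with the byte
# string it came from, so slicing by bytes still yields valid UTF-8.
found = data[byte_pos:byte_pos + len(needle)].decode("utf-8")

print(code_points, byte_len)  # 10 12
print(char_pos, byte_pos)     # 6 7
print(found)                  # café
```

The numbers disagree, but nothing breaks as long as you never mix a byte offset with a character offset, or feed in bytes that are not UTF-8 in the first place.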

BTW, what language do you think gets this stuff right? Go looks promising, but it is brand new. I have problems with pretty much every language I know well: JavaScript/Python/Objective-C/PHP