by est 5009 days ago
Those who fail to realize the importance of Unicode support are the same group of people who think handling strings with a \0 terminator is OK.

But no, it's fucking not OK. Lacking basic Unicode handling means everyone handles Unicode strings in different ways, so various libraries each end up cluttered with their own versions of basic string capabilities. Sometimes this really pisses me off.

1 comments

Um, UTF-8, a valid Unicode encoding, handles strings with a null terminator just fine. Those who live in glass houses should not throw stones.
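To make the null-terminator point concrete, here is a minimal sketch in Python (not Lua, and not anything from the thread). The sample strings and the helper name `has_nul_byte` are made up for illustration. It checks that the UTF-8 encoding of non-ASCII text contains no 0x00 byte, so a NUL-terminated buffer is only cut short when the text itself contains U+0000.

```python
# Hypothetical sample strings: ASCII, Latin, CJK, and an emoji outside the BMP.
samples = ["hello", "naïve", "日本語", "😀"]


def has_nul_byte(text: str) -> bool:
    """Return True if the UTF-8 encoding of text contains a 0x00 byte."""
    return b"\x00" in text.encode("utf-8")


# In UTF-8 every byte of a multi-byte sequence has its high bit set,
# so 0x00 can only come from the code point U+0000 itself.
nul_free = all(not has_nul_byte(s) for s in samples)

# An embedded U+0000 is the one case that would still truncate a C string.
embedded_nul = has_nul_byte("a\x00b")

# For contrast: the same ASCII text encoded as UTF-16 does contain zero bytes.
utf16_has_nul = b"\x00" in "hello".encode("utf-16-le")

print(nul_free, embedded_nul, utf16_has_nul)
```

So the terminator is safe for UTF-8 as an encoding; what it cannot represent is a string that legitimately contains U+0000.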

Proper Unicode support would include things like normalizing strings (Unicode has 4 different normal forms!) and testing for fuzzy equivalence. But last time I checked, the usual library for doing those things, libicu, was bigger than the whole Lua interpreter. So I can see why there isn't a lot of enthusiasm, especially if there isn't a real use case.
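For anyone who hasn't run into the four normal forms (NFC, NFD, NFKC, NFKD), here is a rough sketch using Python's standard `unicodedata` module rather than libicu or Lua. The helper name `equivalent` and the sample strings are assumptions picked for illustration. It shows why comparing raw code points is not enough and how the compatibility forms give a looser kind of equivalence.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The two spellings render the same but differ code point by code point.
raw_equal = composed == decomposed

# Under any of the four forms they normalize to the same sequence.
equal_in_all_forms = all(equivalent(composed, decomposed, f) for f in FORMS)

# Compatibility forms (NFKC/NFKD) also fold things like the "ﬁ" ligature.
ligature = "\ufb01"
canonical_match = equivalent(ligature, "fi", "NFC")
compat_match = equivalent(ligature, "fi", "NFKC")

print(raw_equal, equal_in_all_forms, canonical_match, compat_match)
```

The canonical forms (NFC/NFD) only unify different spellings of the same character, while the compatibility forms deliberately lose distinctions, so which one counts as "equal" depends on the use case.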

Yes, packing ICU with every Lua distribution or standalone app is bad, and we don't even need to think about the OS's Unicode capabilities. So every language deserves to suck at Unicode handling. Because, you know, ICU is big.