by zzo38computer
2394 days ago
I think Unicode is terrible. Remove everything. Use ASCII and other character sets. Unicode is OK for searching data across many different languages (if you omit much of the junk such as emoji and compatibility characters), although it might not be the best for that either. You can't use one character set effectively for everything; different applications have different requirements. Unicode is equally bad for everything, whereas e.g. ASCII is good for some things and unusable for others, and the same goes for other character sets. Many things you just can't do accurately with Unicode.

In our application, our users get data from systems around the world, and might have to change some of it before sending a file with the data to some official system. The data includes names of people and places. How would you do this using character sets?
One file might need to contain names with Cyrillic characters and with Norwegian characters. There's no character set with both. Should each string in the file have an attribute saying which character set the string is encoded in? What are the odds that people implementing that won't mess that up when oh so many can't even get a single encoding attribute right[1]?
Or, just maybe, strings in the file could be Unicode, encoded in say UTF-8, so that the handling of all of them is uniform...
[1]: https://www.w3.org/TR/xml/#charencoding
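
To make the point concrete, here is a minimal Python sketch. The two sample names and the helper `encodable` are made up for illustration, and the three legacy encodings are just a few common ones, not an exhaustive survey. It checks which encodings can represent a Norwegian name and a Cyrillic name, then round-trips both through UTF-8.

```python
# Hypothetical sample data: one Norwegian and one Cyrillic place name.
names = ["Ålesund", "Москва"]

def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# For each encoding, report whether each of the two names can be encoded.
for enc in ("latin-1", "iso-8859-5", "koi8-r", "utf-8"):
    print(enc, [encodable(n, enc) for n in names])

# With UTF-8, both names fit in one file body and survive a round trip.
data = "\n".join(names).encode("utf-8")
assert data.decode("utf-8").split("\n") == names
```

Of the encodings tried here, only UTF-8 accepts both names, so no per-string encoding attribute is needed.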