by sirn
2541 days ago
I can only speak from experience, but Thai uses 3 bytes per character in UTF-8 and relies on combining vowels and tone marks to compose a word, so the number of bytes can grow pretty quickly. The headline of an article in Thai has a good chance of exceeding 510 bytes. This is why one of the major Thai IRC networks (ThaiNet/irc.thai.com) was stuck with TIS-620 for a long time, though I'm not sure if this is still the case. TIS-620 is an 8-bit encoding that is compatible with ASCII and uses 0xA1 to 0xFB for Thai characters.
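To make the byte counts concrete, here is a rough Python sketch comparing the two encodings. The sample word and the `max_thai_chars` helper are my own illustration, not anything from a real IRC client, and the 510-byte figure assumes the whole line is available for text, which in practice it isn't since the prefix and command also count.

```python
# Sample text is an assumption for illustration: the Thai word "sawasdee"
# (hello), written with escapes so the code points are unambiguous.
# It is 6 code points: 4 consonants plus 2 combining vowel marks.
word = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"

utf8_bytes = word.encode("utf-8")       # 3 bytes per Thai code point
tis620_bytes = word.encode("tis_620")   # 1 byte per Thai code point

IRC_LINE_BUDGET = 510  # byte budget mentioned above


def max_thai_chars(budget: int, bytes_per_char: int) -> int:
    """Hypothetical helper: how many Thai code points fit in `budget` bytes."""
    return budget // bytes_per_char


print(len(word), len(utf8_bytes), len(tis620_bytes))
print(max_thai_chars(IRC_LINE_BUDGET, 3), max_thai_chars(IRC_LINE_BUDGET, 1))
```

So the same budget holds about three times as many Thai code points in TIS-620 as in UTF-8, and since vowels and tone marks are separate code points, the number of visible characters is smaller still.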