Hacker News new | ask | show | jobs
by KingLancelot 1258 days ago
Ah, so Twitter’s Unicode implementation is fucked.
1 comments

Nothing is wrong with it.

If it counted UTF-16 code units that would be dumb. It doesn't. The cutoff was deliberately set to keep the 140 character limit for CJK but increase it to 280 for the rest. And they did that based on observational data.

https://cdn.cms-twdigitalassets.com/content/dam/blog-twitter...

https://blog.twitter.com/en_us/topics/product/2017/Giving-yo...