by amake 476 days ago
It's not counting "characters"; it's counting UTF-16 code units, so for instance a country flag is 4 "characters" because it's two regional indicator symbols, each composed of two surrogates.

It should really be using `Intl.Segmenter` instead of `string.length`: https://stackoverflow.com/questions/10287887/get-grapheme-ch...
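
To make the difference concrete, here is a minimal sketch comparing the three ways of counting. The helper names (`countUnits`, `countCodePoints`, `countGraphemes`) and the `"en"` locale default are my own choices for illustration, not anything taken from the tool being discussed. It assumes a runtime that ships `Intl.Segmenter`.

```javascript
// Three ways to "count characters" in the same string.

// UTF-16 code units: what string.length reports.
function countUnits(text) {
  return text.length;
}

// Unicode code points: the string iterator walks code points, not code units.
function countCodePoints(text) {
  return Array.from(text).length;
}

// Grapheme clusters: roughly what a reader perceives as one character.
// The locale default is an assumption made for this sketch.
function countGraphemes(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text)).length;
}

// Flag of Japan: two regional indicator symbols, each a surrogate pair.
const flag = "\u{1F1EF}\u{1F1F5}";
console.log(countUnits(flag), countCodePoints(flag), countGraphemes(flag)); // 4 2 1
```

For plain ASCII text all three numbers agree, which is probably why the difference is easy to miss.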

I mean, if this uses the same calculation as Twitter, Instagram and other sites that limit characters, then it should probably stay as it is; I don't have a problem with that.

> However, a quirk in the way Unicode handles emoji has meant that some of the symbols take up many more characters than others. For example, a flag can take up as many as 14 spaces in your Tweet. Twitter has announced that it's changing the way it counts emoji so that they're all counted equally, as two characters. - https://www.theverge.com/2018/10/11/17963230/twitter-emoji-c...

So a 4-"character" flag, which is displayed as 1 character, is sometimes counted as 2 characters. Honestly, I don't know, I did not expect for this to go so deep.
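
For what it's worth, a rule like "every emoji counts as two" could be sketched roughly as below. This is not Twitter's actual algorithm. The function name `weightedLength`, the weights, and the regex used to decide what counts as an emoji are all assumptions for illustration, and the emoji test is crude: it misses keycap sequences and also matches a few symbols such as the copyright sign.

```javascript
// Hypothetical weighted counter: an emoji grapheme costs 2, anything else costs 1.
// The weights and the emoji check are assumptions, not a real site's rules.
const EMOJI_LIKE = /\p{Extended_Pictographic}|\p{Regional_Indicator}/u;

function weightedLength(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let total = 0;
  for (const { segment } of segmenter.segment(text)) {
    // One grapheme cluster at a time, so a flag is weighed once, not per code unit.
    total += EMOJI_LIKE.test(segment) ? 2 : 1;
  }
  return total;
}

// Flag of Japan: 4 UTF-16 code units, 1 grapheme.
const japanFlag = "\u{1F1EF}\u{1F1F5}";
console.log(japanFlag.length);                 // 4
console.log(weightedLength(japanFlag));        // 2
console.log(weightedLength("hi " + japanFlag)); // 5
```

So the same flag can honestly be reported as 4, 2, or 1 depending on which rule the counter follows.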

> I mean, if this uses the same calculation as Twitter, Instagram and other sites that limit characters

Almost certainly those sites use much more sophisticated counting schemes.

> Honestly, I don't know, I did not expect for this to go so deep

You're not the creator of the tool in question, so it's OK for you not to know about this stuff, but I would really expect more from someone who is setting out to create a "character counter" tool.