by acz 3650 days ago
Let’s start working on an "SVG over UTF" RFC, shall we?

Honestly, I think "SVG over UTF" makes a lot more sense. It's impossible to make a character set that supports every character known to man, because that just puts undue effort on every computer maker, etc., to keep up.

So why don't we pick a very good set: perhaps every letter in every language in common use for the past 200 years? Then, for the oddball symbols that someone wants to mix in text, there can be some kind of SVG-like convention. This allows publishing textual information without requiring that every device maker updates their device to support a 1-off symbol.

> This allows publishing textual information without requiring that every device maker updates their device to support a 1-off symbol.

The main purpose of Unicode is to encode the information. How the information is turned into its visual counterpart is outside the scope of Unicode. For what it's worth, this could be done by linking Unicode code points to matching SVGs in a document. Wait, exactly that is already a W3C standard: https://www.w3.org/TR/SVG/fonts.html
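
To make the "code points linked to SVGs" idea concrete, here is a rough sketch that emits an SVG 1.1 `<font>` fragment in which each `<glyph>` ties one code point to path data. The `<font>`, `<font-face>` and `<glyph unicode="..." d="...">` elements come from the SVG fonts chapter linked above; the `GLYPHS` table, the outlines in it, the `svg_font` helper and the "oddballs" font name are all made up for illustration.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical glyph table: code point -> SVG path data.
# The outlines are invented placeholders, not real font data.
GLYPHS = {
    0x25A0: "M100 100 H900 V900 H100 Z",  # a plain square for U+25A0
    0x25B2: "M500 900 L900 100 H100 Z",   # a plain triangle for U+25B2
}


def svg_font(font_id, glyphs, units_per_em=1000):
    """Return an SVG <font> fragment mapping code points to outlines."""
    lines = [
        f'<font id={quoteattr(font_id)} horiz-adv-x="{units_per_em}">',
        f'  <font-face font-family={quoteattr(font_id)} '
        f'units-per-em="{units_per_em}"/>',
    ]
    for code_point in sorted(glyphs):
        # A numeric character reference keeps the output pure ASCII.
        lines.append(
            f'  <glyph unicode="&#x{code_point:X};" '
            f'd={quoteattr(glyphs[code_point])}/>'
        )
    lines.append("</font>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(svg_font("oddballs", GLYPHS))
```

The text itself still carries only the code point; the shape lives in the font fragment, which is the same split between encoding and rendering described above.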

Because it's easier to throw in random icons than to actually accomplish the goal of "every letter in every language in common use for the past 200 years", or even "past 20 years".

Or, put another way:

'We have an unambiguous, cross-platform way to represent “PILE OF POO” (💩), while we’re still debating which of the 1.2 billion native Chinese speakers deserve to spell their own names correctly.'

https://modelviewculture.com/pieces/i-can-text-you-a-pile-of...

This is a link from the article's author that is intended to make it easier for us to add useful symbols: https://github.com/jloughry/Unicode I recommend you use it to add any glyphs that you feel are being neglected.

That article raises an interesting issue about a character in the author's name that is missing from Unicode. Unfortunately the article is (how to put this?) not constructive. The complex reasons that Unicode excluded the character are described in [1]. If the author addresses those issues, there's a much better chance of getting the desired character into Unicode.

[1] http://www.unicode.org/L2/L2004/04252-khanda-ta-review.pdf

Correct me if I'm wrong, but isn't the Han Unification project more about unifying semantically distinct but visually identical characters under the same code point (rather than grouping together similar-looking code points, as the article suggests)? As far as I'm aware, it's more along the lines of reusing the code point for 'a' when encoding both English and Spanish text. Am I mistaken in thinking this?

But if the shape is embedded in the text, font choice becomes meaningless.

> undue effort on every computer maker, etc., to keep up.

The effort to update the font files every few years? Unless you insist on supporting a new Unicode version the second it comes out, I don't see the big effort here. Of course there is effort for font makers, but this is quite centralised.

What about the oddest oddballs, whose "symbols" are animations like http://www.reactiongifs.com/r/tww.gif? They are used a lot on Reddit, sometimes even with sound.