by jcranmer 3387 days ago
You could just read PropList.txt to find the list of characters with the Deprecated property:

    0149          ; Deprecated # L&       LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
    0673          ; Deprecated # Lo       ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
    0F77          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC RR
    0F79          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC LL
    17A3..17A4    ; Deprecated # Lo   [2] KHMER INDEPENDENT VOWEL QAQ..KHMER INDEPENDENT VOWEL QAA
    206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
    2329          ; Deprecated # Ps       LEFT-POINTING ANGLE BRACKET
    232A          ; Deprecated # Pe       RIGHT-POINTING ANGLE BRACKET
    E0001         ; Deprecated # Cf       LANGUAGE TAG
(Note that the ruby annotation codepoints, U+FFF9..U+FFFB, aren't on that list.)
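If you'd rather not eyeball the file, something like the following should pull the same list out programmatically. It's only a rough sketch: `codepoints_with_property` and `SAMPLE` are names I made up for illustration, and the sample text is just the lines quoted above plus one unrelated `White_Space` line to show the filtering. A real run would read the full PropList.txt from the Unicode Character Database instead.

```python
import re

# Lines copied from the listing above, plus one non-Deprecated line.
# Assumption: a real run would read the whole PropList.txt file instead.
SAMPLE = """\
0020          ; White_Space # Zs       SPACE
0149          ; Deprecated # L&       LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
0673          ; Deprecated # Lo       ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
0F77          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC RR
0F79          ; Deprecated # Mn       TIBETAN VOWEL SIGN VOCALIC LL
17A3..17A4    ; Deprecated # Lo   [2] KHMER INDEPENDENT VOWEL QAQ..KHMER INDEPENDENT VOWEL QAA
206A..206F    ; Deprecated # Cf   [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
2329          ; Deprecated # Ps       LEFT-POINTING ANGLE BRACKET
232A          ; Deprecated # Pe       RIGHT-POINTING ANGLE BRACKET
E0001         ; Deprecated # Cf       LANGUAGE TAG
"""

# "XXXX" or "XXXX..YYYY", then ";", then the property name.
LINE_RE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def codepoints_with_property(text, prop):
    """Return every code point in `text` that is listed with property `prop`."""
    result = []
    for line in text.splitlines():
        # Everything after '#' is a comment (category, count, character names).
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        match = LINE_RE.match(data)
        if match is None or match.group(3) != prop:
            continue
        start = int(match.group(1), 16)
        # A single code point is treated as a range of length one.
        end = int(match.group(2) or match.group(1), 16)
        result.extend(range(start, end + 1))
    return result


deprecated = codepoints_with_property(SAMPLE, "Deprecated")
print(" ".join(f"U+{cp:04X}" for cp in deprecated))
```

Expanding the ranges makes membership checks trivial (`0xFFF9 in deprecated`), at the cost of losing the compact range form the file uses.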

The use in XML/HTML is no longer maintained by Unicode; it is maintained by the W3C instead: https://www.w3.org/TR/unicode-xml/.