| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by tialaramex 1613 days ago

For the string encoding thing, however, it does have UTF-8 and you should not use anything else to express arbitrary human text anyway.

PKIX actually leverages the weird encoding restriction to our benefit. It defines two kinds of names which things might have on the Internet (you can and should stop trying to name things which are actually on the Internet some other way), DnsNames and IpAddresses. IpAddresses, since they're either 32-bit or 128-bit arbitrary bit values, are just represented as either 32-bit or 128-bit arbitrary bit values. So you cannot express the erroneous IPv4 address 100.200.300.400 as an IpAddress, which means you can't trip up somebody's parser with that nonsense address. DnsNames use a deliberately sub-ASCII encoding from ASN.1 which can express all the legal DNS names (all A-labels and the ASCII dot . are permissible) but can't express lots of other goofy things including most Unicode. So a certificate issuer, even if they're completely incompetent, cannot write a valid DnsName that expresses some garbage IDN as Unicode. Hopefully they read the documentation and find out they need to use A-labels (Punycode) but if not they're prevented from emitting some ambiguous gibberish.

Even in forums where you'd once have expected pushback, "Just use UTF-8" is becoming more widespread. Microsoft for example, once upon a time you'd get at least some token resistance, today they're likely to agree "Just use UTF-8". So ASN.1 ends up no worse off for a half a dozen bad ways to write text you shouldn't use, compared to say XML, HTML, and so on.

1 comments

xyzzy123 1613 days ago

Agree, although the right thing to do helps in specific applications but not so much in the general case. You're very often stuck with other people's MIBs / specs and encoders, trying to make sense of what a) they're allowed to put on the wire and b) what they actually do and under what circumstances.

link