by ur-whale 557 days ago
> is because it's to be machine read only

Why did they bother making it text-only ASCII then?

JSON wins because it can be casually inspected by people testing bizarre theories. The importance of this is lost on people who don’t treat triage as a skill that can be honed.

I like to solve problems - or at least bringing them to me doesn’t result in a loss of status for either party. People notice this about me and bring me problems. Someone recently described to people what is essentially my process: score each possible cause by its likelihood divided by the difficulty of verifying it. Partially sort by that score and just start checking off assumptions.

A lot of cheap but low-probability options get shuffled higher, and just sending the wrong data is a common enough problem, especially with caching. And if it’s nearly free to look at the payload, it’ll get checked. If it isn’t, people will try everything else to avoid it.
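A minimal sketch of that ranking, in Python. The `Hypothesis` class, the candidate causes, and all the numbers are made up for illustration; the only part taken from the comment is the score, likelihood divided by verification cost. A full sort stands in for the partial sort, since the list is tiny.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    likelihood: float    # rough guess, between 0 and 1
    cost_minutes: float  # rough effort needed to confirm or rule it out


def triage_order(hypotheses):
    """Order candidates by likelihood per unit of verification effort."""
    return sorted(
        hypotheses,
        key=lambda h: h.likelihood / h.cost_minutes,
        reverse=True,
    )


# Hypothetical candidates for a "wrong data in the response" bug.
candidates = [
    Hypothesis("race condition in worker pool", 0.5, 240),
    Hypothesis("stale cache returns wrong payload", 0.1, 2),
    Hypothesis("bad deploy config", 0.3, 30),
]

ordered = [h.name for h in triage_order(candidates)]
```

With these guesses the stale-cache check comes first even though it is the least likely cause, because looking at the payload takes two minutes.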

> ASCII

JSON is notable for making UTF-8 encoding a hard requirement.

…which was pretty ballsy back in the mid-2000s. We were still fighting with Shift-JIS and Windows-1252. Excel didn’t add proper support for UTF-8 until depressingly recently.

In the late ’90s I had to fix bugs in a Shift-JIS implementation. And I couldn’t read a lick of Japanese. Still can’t.

I don’t remember when I started pushing for UTF-8 everywhere, but it was “early” by most people’s standards, so I know what you mean.

And one of the things that makes me dislike MySQL is that they have a character set called utf8 that isn’t: it stores at most three bytes per character. And they didn’t fix it, they introduced a new one (utf8mb4) instead. So that footgun was still there for all to trigger. So mad.
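For anyone who hasn't hit this: MySQL's `utf8` is the three-byte `utf8mb3`, so any character that needs four bytes in real UTF-8 (emoji, anything outside the Basic Multilingual Plane) doesn't fit. A small Python sketch of the check; `fits_in_three_byte_utf8` is a made-up helper name, not part of any MySQL client library.

```python
def fits_in_three_byte_utf8(text: str) -> bool:
    """True if every character encodes to at most three UTF-8 bytes."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


japanese_ok = fits_in_three_byte_utf8("日本語")   # each character is 3 bytes
emoji_ok = fits_in_three_byte_utf8("hello 😀")    # the emoji needs 4 bytes
```

Text that fails this check needs a `utf8mb4` column to round-trip intact.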

Pretty sure they meant plaintext instead of ASCII.
JSON does not require UTF-8 encoding.

> JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8

https://datatracker.ietf.org/doc/html/rfc8259

Ah ok, fair enough. This is a more recent (2017) clarification of the standard which I hadn't seen. The original mid-2000s specification did not require UTF-8.

> Previous specifications of JSON have not required the use of UTF-8 when transmitting JSON text. However, the vast majority of JSON-based software implementations have chosen to use the UTF-8 encoding, to the extent that it is the only encoding that achieves interoperability.

The original spec did require that all JSON decoders support UTF-8, though.

Hmm, not as I read it. It says that UTF-8 is the 'default' encoding. In context that just means it's the encoding you assume if the pattern of null octets in the first four octets doesn't match one of the other encodings (the first two characters of a JSON text are always ASCII under that spec, which is what makes the check work). See section 3 of RFC 4627: https://datatracker.ietf.org/doc/html/rfc4627

The original RFC is vague, but I think the idea is that a fully conformant implementation would support all the encodings mentioned.
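Roughly what that section 3 check looks like, as a Python sketch. The null-octet patterns are the ones listed in RFC 4627; the function names are made up, and falling back to UTF-8 for inputs shorter than four octets is my own assumption, since the RFC doesn't spell that case out.

```python
import json

# Which of the first four octets are zero, per RFC 4627 section 3.
_NULL_PATTERNS = {
    (True, True, True, False): "utf-32-be",   # 00 00 00 xx
    (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
    (False, True, True, True): "utf-32-le",   # xx 00 00 00
    (False, True, False, True): "utf-16-le",  # xx 00 xx 00
}


def detect_json_encoding(data: bytes) -> str:
    """Guess the Unicode encoding of a JSON text from its first four octets."""
    if len(data) < 4:
        return "utf-8"  # assumption: too short to inspect, use the default
    nulls = tuple(octet == 0 for octet in data[:4])
    return _NULL_PATTERNS.get(nulls, "utf-8")  # xx xx xx xx falls through


def loads_rfc4627(data: bytes):
    """Decode with the detected encoding, then parse."""
    return json.loads(data.decode(detect_json_encoding(data)))
```

This only works because RFC 4627 limits a JSON text to an object or array, so the first two characters are always ASCII. It doesn't handle a byte order mark.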