by taeric 1169 days ago
> Turned out JavaScript was the first language to give us lambdas, and that was an amazing breakthrough.

I mean... with charity I can see the context and get it. But. What!?

Overall a fun read through history, even if definitely from Doug's perspective only. (As evidenced by JavaScript being an originator of lambdas...) I do find the idea that JSON was as novel as history says it was kind of odd. I remember inlining JavaScript objects years before "JSON" was a thing. Making it a subset of what JavaScript could already do seems straightforward and a good execution. Getting rid of comments feels asinine to me. (I'll also note that the plethora of behaviors you get from JSON parsers shows that it is effectively CSV. Sure, there may be a "standard" out there, but by and large it is a duck-typed one.)

I'm also a bit in the camp that XML is better than JSON. Being able to have better datatypes, for a start. Schemas that allow autocompletion. It is also easier to see as a markup language (per the name). That said, they clearly went too far with entities and, despite making sense for markup, attributes versus children are more than a touch awkward.

I also recall that what killed XML and WSDL files in general was the complete shit show that was getting a single document to work with both MS and non-MS clients.

2 comments

Crockford mentions Scheme right before that, so he's aware lambdas originated with Lisp, presumably. I guess he means JS was the first mainstream language to popularize them?
Yeah, that is why I think I can see the point with charity to the discussion. Still an awkward proclamation. Many people were coding with LISPs for a long time before JavaScript came onto the scene. And I don't think LISPs were the only languages with lambdas?
The current XML standard is hot garbage since it completely disallows null characters even via "&#0;" despite most languages now supporting nulls in the middle of strings. Also, JSON definitely allows schemas, primarily through the JSON Schema standard, but I've also seen TypeScript notation used for this as well, which has the convenience of being readable by more people (I strongly suspect more people know TypeScript than know XML schemas and JSON schemas combined).
JSON is garbage to read largely due to how much needs escaping. This is largely fine for smaller documents, but there is a reason yaml and toml both gained traction over raw json for config files.

And I don't make any real defense of some of the darker corners of XML. In particular, I already criticized entities being a bit too much. Namespaces are also something that, while I can see the desire, the implementation is way too much for most of us.

JSON schema is going to be cursed for a long time. Just the odd treatment of numbers will be a problem. (In particular, that JSON numbers are a subset of the numbers that JavaScript itself supports is... awkward.)

I also confess, though, that I'm not clear why I would want a null in the middle of a string? That feels like a gun loaded and aimed squarely at a foot.

Most languages (C#, Java, Rust, JavaScript, etc.) support nulls in the middle of strings so it can be a security vulnerability if you try to serialize untrusted input to XML. I'd much rather be able to encode anything my input language considers a string and deal with excessive escaping than need to worry about what I'm going to do with inputs that my serialization language cannot support.
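A minimal Python sketch of the mismatch described here, assuming the standard-library `json` and `xml.etree.ElementTree` modules are representative (other XML libraries may behave differently). A string with an embedded NUL round-trips through JSON, while a document containing the character reference `&#0;` is rejected by the XML parser:

```python
import json
import xml.etree.ElementTree as ET

value = "abc\x00def"  # a string with an embedded NUL character

# JSON escapes the NUL as \u0000 and restores it on parse.
encoded = json.dumps(value)
json_round_trips = json.loads(encoded) == value

# XML 1.0 does not allow U+0000, even as a character reference,
# so the parser raises ParseError on this document.
try:
    ET.fromstring("<s>abc&#0;def</s>")
    xml_accepts_nul_ref = True
except ET.ParseError:
    xml_accepts_nul_ref = False

print(encoded, json_round_trips, xml_accepts_nul_ref)
```

The variable names are only for illustration. The point is that the failure shows up at parse time on the receiving side, which is easy to miss if the serializer is never tested with such input.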
I'm curious what the vulnerability is? Also not clear what the null character is. Any links I can follow?

And again, if this is your line in the sand, how do you serialize NaN and Infinity in JSON?

Edit: Playing with this a bit, I'd actually assume that allowing \0 would be a vulnerability. I was curious how browsers treat it, so I see that parsing to an html document seems to just drop the characters? Fun little rabbit hole to jump in!

Yeah, that's why I consider it to be a breeding ground for vulnerabilities. People will probably just assume the XML serializer can handle any strings in their language of choice and not handle those edge cases. What I ended up doing for my use case was to encode nulls as "&#0;" but within a CDATA section so it was interpreted literally (choosing ambiguity over omission). The best way would probably be to have some sort of special <null /> element, but there isn't such a thing within the standard. There is xsi:nil, but that is really indicating something else.
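A rough sketch of that CDATA workaround in Python, under some assumptions: the helper names `encode_with_nul_marker` and `decode_with_nul_marker` are hypothetical, the `<s>` element is made up, and the decode step is my guess at how a consumer might undo the convention. It also shows the ambiguity mentioned above:

```python
import xml.etree.ElementTree as ET

def encode_with_nul_marker(value: str) -> str:
    # Hypothetical convention: write each NUL as the literal text "&#0;"
    # inside a CDATA section, so the parser does not treat it as a reference.
    # Assumes the value does not contain "]]>", which would end the section.
    return "<s><![CDATA[" + value.replace("\x00", "&#0;") + "]]></s>"

def decode_with_nul_marker(document: str) -> str:
    # The parser returns the CDATA content verbatim; map the marker back.
    text = ET.fromstring(document).text or ""
    return text.replace("&#0;", "\x00")

original = "abc\x00def"
restored = decode_with_nul_marker(encode_with_nul_marker(original))

# Ambiguity: input that already contains the literal text "&#0;"
# comes back with a NUL in its place.
ambiguous = decode_with_nul_marker(encode_with_nul_marker("abc&#0;def"))

print(restored == original, ambiguous == original)
```

Both sides have to agree on the marker out of band, which is the gap a standard null element would have closed.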
But what is the vulnerability? And what is a null character doing in a text document?

If you are just worried about data loss, having null allowed in text segments is already begging for failure, as C programs will almost certainly get them wrong.

If you are transferring binary, base64 or similar will already cover you.
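For what it's worth, a minimal sketch of the base64 route in Python, assuming a made-up `<blob>` element and the standard-library `base64` and `xml.etree.ElementTree` modules:

```python
import base64
import xml.etree.ElementTree as ET

payload = b"abc\x00def"  # arbitrary bytes, including a NUL

# Encode the bytes to ASCII text before placing them in the document.
elem = ET.Element("blob")  # "blob" is a hypothetical element name
elem.text = base64.b64encode(payload).decode("ascii")
document = ET.tostring(elem, encoding="unicode")

# The receiver parses the document and decodes the text back to bytes.
decoded = base64.b64decode(ET.fromstring(document).text)

print(document, decoded == payload)
```

The cost is that the content is no longer readable as text in the document, so this fits binary payloads better than strings that only occasionally contain a NUL.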

And again, if this is a strike on xml, how do you represent NaN in a JSON document? Do what DynamoDB does and wrap all numbers in quotes?
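To make the NaN question concrete, here is a small sketch using Python's standard `json` module (one parser among many; others behave differently). By default it emits a non-standard `NaN` token, in strict mode it refuses, and quoting the number as a string is one possible workaround. The key `"x"` is just a placeholder:

```python
import json
import math

# By default, Python's json module emits the token NaN,
# which is not part of the JSON grammar.
lenient = json.dumps({"x": float("nan")})

# With allow_nan=False it raises instead of emitting invalid JSON.
try:
    json.dumps({"x": float("nan")}, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

# Workaround sketch: carry the number as a string and convert on read.
quoted = json.dumps({"x": str(float("inf"))})
restored = float(json.loads(quoted)["x"])

print(lenient, strict_ok, quoted, math.isinf(restored))
```

The quoted form only works if both ends agree that the field is a number in disguise, which is the same out-of-band agreement problem as the null case.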