| HN Mirror



	by model-15-DAV 556 days ago
	JSON as a data-format should not have comments. JSON as a file-format should allow comments. The problem is this conflation between the two.

8 comments

lolinder 555 days ago

But they're not two different formats—they're two different jobs being done by the same format.

JSON as currently spec'd is honestly quite bad at both jobs, but the most rational defense of its use as a data format is that it's (mostly) human readable. Given that that's its main value proposition, what exactly is the reason for saying that JSON-as-data-format should not have comments? What do we lose if we allow them?

throw0101d 555 days ago

> Given that that's its main value proposition, what exactly is the reason for saying that JSON-as-data-format should not have comments? What do we lose if we allow them?

Because JSON originally did have comments, and people were putting pragmas into them, and so different parsers would act differently depending on whether they understood them or not. Comments ended up being an anti-feature in JSON because people were abusing them.

Source:

> I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't. […]

* https://web.archive.org/web/20190112173904/https://plus.goog...

Cpoll 555 days ago

I don't buy it, what's stopping people from putting pragmas in key:value pairs? There's a chance of collision, but you're already deciding to sacrifice interoperability, so just accept that the myJson spec says '___declare___' is a reserved key.

jakeogh 555 days ago

If I parse json, I don't want to lose data. Having the parser read the comments (whatever form they take, as long as they are in spec and therefore read by the parser) is a good thing. Having to parse the file again, with a fuzzy out-of-spec system (looking for comments), is clearly worse. The whole point of json is to serialize stuff; breaking that to insert non-machine-readable comments makes the spec less reliable.

staz 555 days ago

You cut the rest of his comment

> Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.

Doesn't seem he is that against the idea
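
A minimal sketch of the "strip the comments, then parse" workflow quoted above, written in Python rather than using JSMin itself. `strip_json_comments` is a hypothetical helper, not part of any library, and it assumes only `//` and `/* */` comments need handling; the sample config is invented for illustration.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // line comments and /* block */ comments that sit outside strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim so \" does not end the string.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, keeping the newline itself.
            while i < n and text[i] != "\n":
                i += 1
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Invented example config: note the "//" inside the URL string must survive.
annotated = """
{
    // where the service listens
    "url": "http://localhost:8080", /* port is arbitrary */
    "retries": 3
}
"""

config = json.loads(strip_json_comments(annotated))
```

The helper tracks whether it is inside a string, so the `//` in the URL is left alone. The parser itself stays a plain spec-compliant one; the comments never reach it.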

tom_ 555 days ago

What are some examples of people doing this?

m463 555 days ago

> but the most rational defense of its use as a data format is that it's (mostly) human readable

I would call out portability instead, which is not dependent on the byte ordering or endianness issues of binary data formats.

sort of like: javascript is portable code, json is portable data.
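
A small sketch of the byte-ordering point, assuming a 16-bit unsigned integer as the binary payload. The value 258 is an arbitrary choice for illustration.

```python
import json
import struct

value = 258  # 0x0102

# Same integer, two binary layouts: the reader must know the byte order.
little = struct.pack("<H", value)
big = struct.pack(">H", value)

# Reading little-endian bytes as big-endian silently gives a different number.
misread = struct.unpack(">H", little)[0]

# The JSON text is the same sequence of characters on every machine.
text = json.dumps({"value": value})
```

Here `little` and `big` differ byte for byte and `misread` is not 258, while `text` parses back to the same number regardless of the host's endianness.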

sophacles 555 days ago

I think json should allow comments.

But there are dangers there - look at how horribly comments get abused in code:

* doctests are nonsense, just write tests. (doctests like Rust's that just validate example snippets are the closest thing to good I've seen so far, but still make me nervous).

* load bearing comments that code mangling/generation tools rely on (see a whole bunch of generated scripts in your Linux system - DO NOT EDIT BELOW THIS LINE)

* things like modelines in editors that affect how programs interact with the code

* things like html or xml comments that on parsing affect end user program logic.

Comments can be abused, and in something like JSON on the wire I can see systems which take additional info from the comments as part of the primary data input. Often a completely different format... and you end up with something like the front-matter on your markdown files as found in static site generators.

Point being, comments are not a purely benign addition.

mschuster91 555 days ago

> see a whole bunch of generated scripts in your Linux system - DO NOT EDIT BELOW THIS LINE

these are mostly a warning sign for humans, to be read as "if you need to modify the script below this line, a) you gotta be knowing what you're doing, we are not held liable for support if you change stuff around there b) please contact us to make sure we didn't miss a legitimate need or c) you're trying to do something in a bad way and there's better ways to do so".

Cpoll 555 days ago

I feel like most of your examples of "abuse" are just getting things done.

I don't see anything intrinsically wrong with doctests. I also can't see a better way to do "load bearing comments," and I'm not eager to go back to "Step 2: Edit your .bashrc to include foo."

pulsarmx 555 days ago

How is abusing comments any different than abusing a top-level property with a key like “__comment”?

amatecha 555 days ago

at least a top-level metadata property can be explicitly defined in a .json.schema[0] and formalized, rather than being some kind of ad-hoc pre-processor step you have to evaluate before actually using the JSON data. I didn't even know about that approach before I read your comment but it instantly makes more sense to me in terms of maintainability and interoperability.

[0] https://json-schema.org/
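
A rough sketch of the "explicit top-level metadata property" approach, assuming a reserved key named `__comment` as in the parent comment. The key name, the `split_comment` helper, and the sample document are all hypothetical. The hand-written type check only stands in for what a real schema would declare; no JSON Schema library is used here.

```python
import json

COMMENT_KEY = "__comment"  # hypothetical reserved key, not part of any standard


def split_comment(document: str):
    """Parse JSON and separate the reserved top-level comment from the payload."""
    parsed = json.loads(document)
    if not isinstance(parsed, dict):
        # Arrays and scalars have no place to carry the reserved key.
        return parsed, None
    comment = parsed.pop(COMMENT_KEY, None)
    if comment is not None and not isinstance(comment, str):
        raise ValueError(f"{COMMENT_KEY} must be a string")
    return parsed, comment


data, note = split_comment('{"__comment": "bumped for the load test", "retries": 5}')
```

Any standard parser can read the document, and consumers that do not know about the key simply see one extra property.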

michaelmior 555 days ago

> doctests are nonsense, just write tests

Why are doctests not tests?

> doctests like Rust's that just validate example snippets are the closest thing to good I've seen so far

Rust's doctests don't seem to be fundamentally different from Python doctests, which is the language I've seen most commonly make use of doctests.
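
For reference, a minimal Python doctest, run through the standard `doctest` module. The `clamp` function is an invented example; the point is only that the snippets in the docstring are executed and their output compared.

```python
import doctest


def clamp(value, low, high):
    """Limit value to the range [low, high].

    >>> clamp(5, 0, 10)
    5
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(42, 0, 10)
    10
    """
    return max(low, min(value, high))


# Collect the examples from clamp's docstring and execute them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(clamp, "clamp", globs={"clamp": clamp}):
    runner.run(test)

# results.failed counts mismatches, results.attempted counts examples run.
results = runner.summarize(verbose=False)
```

In everyday use the same examples are usually run with `python -m doctest file.py` or `doctest.testmod()` instead of driving the finder and runner by hand.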

Kwpolska 555 days ago

The problem is using JSON as a file format in the first place. It’s not designed for humans to edit. (Then again, it’s better than the Norway-sceptic YAML.)

peterashford 555 days ago

I disagree. At least in an ought vs is sense: it's entirely the kind of format that I would create as an editable format. As witnessed by the fact that my workmates and I did create very nearly JSON previously as a file format in the 90s (but for C code programs)

Kwpolska 555 days ago

For a very narrow target audience (programmers), JSON is fine. If you want the file to be edited by anyone else, JSON is pain.

Jare 555 days ago

What example(s) of file format would you say are designed for humans to edit and still represent the kind of structured contents that json does?

thayne 555 days ago

TOML, extensions of json like json5 and hjson, a bunch of lesser known formats for nested structures like NestedText, UCL, kdl, Eno, sdlang, eldf, etc.

Also languages with some programmatic capabilities like cue, dhall, jsonnet, nickel etc.

None of them are perfect, and some are less suitable for certain use cases than others. But IMO pretty much all of them are better for human editing than json, and in many cases yaml.

codedokode 555 days ago

Any format that:

- doesn't require to quote everything

- has lists/dictionaries

- uses indentation and new lines instead of commas and brackets

- doesn't have 1000 unnecessary features like YAML

Also, you don't need all types from JSON.

michaelmior 555 days ago

> you don't need all types from JSON

JSON has a very minimal set of types and I regularly use all of them. I guess you could argue that integers and numbers could be combined, but I think that's it.

Kwpolska 555 days ago

I can’t think of anything that is not painful in some way.

tedunangst 555 days ago

JSON with trailing comma support.

Cthulhu_ 555 days ago

But it happens. 'npm install' will edit your json file, but so can I.

That said, I don't like it as a config file read/written by humans.

martin-adams 556 days ago

Can I confirm that the reason it's not preferred to have comments in data-formats is that it's meant to be machine read only, and as such should be as efficient as possible and not contain information that won't be used?

Seeing as I can only see the use case as a file format to be read/written by humans in the loop, then maybe the conversation should be about compiling the file format to a data format for compatibility outside of the user tooling.

johannes1234321 555 days ago

The argument is that comments are often used as an escape hatch from specified formats to carry further instructions. So you've got a properly specified format and then want to add vendor extensions without breaking other implementations ... just make your extensions a comment. Then other parsers ignore it and you can do your thing.

The idea is that this forces better formats.

How well this works? Well, then I got an "x-comment" property or non-standard comments. Nonetheless. If people see the need to hack some extension in, they'll find a way.

ur-whale 555 days ago

> is because it's to be machine read only

Why did they bother making it text-only ASCII then?

hinkley 555 days ago

JSON wins because it can be casually inspected by people testing bizarre theories. The importance of this is lost on people who don’t treat triage as a skill that can be honed.

I like to solve problems - or at least bringing them to me doesn’t result in a loss of status for either party. People notice this about me and bring me problems. Someone recently described to people what is essentially my process: the likelihood of the cause divided by the difficulty of verification. Partially sort and just start checking off assumptions.

A lot of cheap but low probability options get shuffled higher, and just sending the wrong data is a common enough problem, especially with caching. And if it’s nearly free to look at the payload, it’ll get checked. If it isn’t people will try everything else to avoid it.
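
The ranking described above (likelihood of the cause divided by difficulty of verification) can be sketched like this. The hypotheses and every number are made up for illustration; `Hypothesis` and its fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    likelihood: float    # rough guess in (0, 1] that this is the cause
    cost_minutes: float  # rough effort to confirm or rule it out

    @property
    def score(self) -> float:
        # Higher means "check this sooner": likely and/or cheap to verify.
        return self.likelihood / self.cost_minutes


# Invented numbers, purely to show how cheap checks float to the top.
hypotheses = [
    Hypothesis("race condition in the worker pool", 0.40, 240),
    Hypothesis("stale cache entry", 0.15, 10),
    Hypothesis("wrong payload sent", 0.10, 2),
]

ranked = sorted(hypotheses, key=lambda h: h.score, reverse=True)
order = [h.name for h in ranked]
```

With these guesses the least likely cause is checked first, because looking at the payload costs almost nothing. That is the case for a format that can be inspected by eye.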

DaiPlusPlus 555 days ago

> ASCII

JSON is notable for making UTF-8 encoding a hard requirement.

…which was pretty ballsy back in the mid-2000s. We were still fighting with Shift-JIS and Windows-1252. Excel didn’t add proper support for UTF-8 until depressingly recently.

hinkley 555 days ago

Late 90’s I had to fix bugs in a shiftJIS implementation. And I couldn’t read a lick of Japanese. Still can’t.

I don’t remember when I started pushing for utf-8 everywhere but it was “early” by most people’s standards, so I know what you mean.

And one of the things that makes me dislike MySQL is that they have a character set called utf8 that isn’t. And they didn’t fix it, they introduced a new one (utf8mb4) instead. So that footgun was still there for all to trigger. So mad.
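
For context, the MySQL charset in question stores at most three bytes per character, while real UTF-8 needs up to four. A quick Python sketch of which characters cross that line; the sample characters are arbitrary.

```python
# Arbitrary sample characters: Latin with accent, CJK, emoji.
samples = ["é", "日", "😀"]

# Number of bytes each character occupies when encoded as real UTF-8.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

# Characters that need a fourth byte do not fit in a three-byte-per-character charset.
too_wide = [ch for ch, n in byte_lengths.items() if n > 3]
```

Anything outside the Basic Multilingual Plane, emoji included, lands in `too_wide`.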

joemi 555 days ago

Pretty sure they meant plaintext instead of ASCII.

foldr 555 days ago

JSON does not require UTF-8 encoding.

DaiPlusPlus 555 days ago

> JSON text exchanged between systems that are not part of a closed ecosystem MUST be encoded using UTF-8

https://datatracker.ietf.org/doc/html/rfc8259

foldr 555 days ago

Ah ok, fair enough. This is a more recent (2017) clarification of the standard which I hadn't seen. The original mid 2000s specification did not require UTF-8.

> Previous specifications of JSON have not required the use of UTF-8 when transmitting JSON text. However, the vast majority of JSON-based software implementations have chosen to use the UTF-8 encoding, to the extent that it is the only encoding that achieves interoperability.
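
A short sketch of what this looks like from Python's standard `json` module, assuming an invented payload. By default `json.dumps` escapes non-ASCII characters, which yields pure ASCII (and therefore valid UTF-8); with `ensure_ascii=False` the characters are emitted directly and the caller encodes the text as UTF-8.

```python
import json

payload = {"city": "Zürich"}  # invented example data

# Default: non-ASCII characters become \uXXXX escapes, so the bytes are plain ASCII.
escaped = json.dumps(payload).encode("utf-8")

# ensure_ascii=False keeps the character itself; UTF-8 encoding is then explicit.
raw = json.dumps(payload, ensure_ascii=False).encode("utf-8")

# json.loads accepts bytes and decodes both forms to the same object.
round_trip_ok = json.loads(escaped) == json.loads(raw) == payload
```

Both byte strings are valid JSON under the UTF-8 requirement quoted above; they differ only in how the "ü" is spelled on the wire.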

burnished 555 days ago

I think in the JSON case it's because you can't have true comments: any comments are intrinsically part of the data structure, and you invite problems by including irrelevant information

nivertech 555 days ago

Thinking from first principles:

1. comments are metadata (specifically Human/LLM-readable metadata vs machine-readable metadata)

2. general-purpose data formats should support metadata

jimmaswell 555 days ago

Disallow comments and now you just have {"comment": "the quick brown fox.."}, the worst of both worlds.

hombre_fatal 555 days ago

That's a harmless example and a tiny price to pay.

What no-comments saved us from was stuff like this in our data interchange:

    {
        "count": 123, // bigint
        "price": 10.99, // @precision=2
        "date": "2024-08-12", // @format=YY-MM-dd
        "data": /* !transform(rot13) */ "uryyb",
        "storage": 5 // Unit(TB)
    }

And who knows what deeper layers of hell we avoided.

Frankly, VSCode shows that all this time people were complaining about no comments in JSON config and how hard it was to write config in JSON, they could have just written their apps to strip comments at read time.

So we do have the best of both worlds.

bearjaws 555 days ago

Your example is perfect, I'm stealing this for the next time JSON comments come up.

codedokode 555 days ago

> JSON as a file-format should allow comments.

JSON is awful for writing manually because it requires typing too many quotes, commas etc. I think JSON is meant to be machine-generated and machine-read and therefore doesn't need any comments.

leptons 555 days ago

You're a programmer and you're against writing quotes and commas? You must really hate coding. I've never found JSON to be too much typing.

xnorswap 555 days ago

If you're entirely machine writing and reading, but still want to be human-legible, then XML does a much better job while also allowing for a schema.

pjc50 555 days ago

The only reason it became popular is that conflation!

Someone 555 days ago

I think it’s more “it would be nice if JSON intended to be read or written by humans allowed comments”.