by mstade 3327 days ago
> JSON Feed files must be served using the same MIME type — application/json — that’s used whenever JSON is served.

So then it's JSON, and I'll treat it as any other JSON: a document that is either an object or an array, that can include other objects or arrays, as well as numbers and strings. Property names don't matter, nor does the order of properties or array items, or whatever values are contained therein.

Please don't try to overload media types like this. Atom isn't served as `application/xml` precisely because it isn't just XML; it's served as `application/atom+xml`. For a media type that is JSON-based but isn't just JSON, you may wish to look at `application/hal+json`; incidentally there's also `application/hal+xml` for the XML variant.

Or as someone else rightly suggested, consider just using JSON-LD.


It's worth pointing out that any valid JSON value is a valid JSON document. There is no requirement or guarantee that an array or an object are the top-level value in a JSON document.

"I am a valid JSON document. So is the Number below, and in fact every line below this line."

4

null
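
For anyone who wants to check this against a real parser, here is a minimal sketch using Python's standard `json` module, which accepts any JSON value at the top level. The `samples` list is just an illustration built from the lines above; other parsers may behave differently.

```python
import json

# Each string is a complete JSON text consisting of a single value.
samples = ['"I am a valid JSON document."', "4", "null"]

# json.loads accepts any JSON value at the top level,
# not only objects and arrays.
parsed = [json.loads(text) for text in samples]

for text, value in zip(samples, parsed):
    print(f"{text!r} parses to {value!r}")
```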

Actually actually... the JSON spec doesn't define the concept of a JSON document. Neither http://www.json.org/ nor http://www.ecma-international.org/publications/files/ECMA-ST... actually specifies that a JSON 'document' is synonymous with a JSON 'value'.

Now it's also true that JSON doesn't specify an entity that can be either an object or an array but not be a string or a bool or a number or null. So it's kind of true that JSON doesn't say that an object or array are valid root elements.

But JSON also says "JSON is built on two structures" - arrays and objects. It defines those two structures in terms of 'JSON values'. But it's a reasonable way to read the JSON spec to say that it defines a concept of a 'JSON structure' as an array or object - but not a plain value. And then to assume that a .json file contains a JSON 'structure'.

Basically... JSON's just not as well defined a standard as you might hope.

edit: And now I'm going to well actually myself: Turns out https://tools.ietf.org/html/rfc4627 defines a thing called a 'JSON text' which is an array or an object, and says that a 'JSON text' is what a JSON parser should be expected to parse.

So - pick a standard.

JSON is in fact defined in (at least) six different places, as described in the piece 'Parsing JSON is a Minefield' [1] (HN: [2]).

The problem is perhaps not as egregious as with "CSV" -- which is more of a "technique" than a format, even though someone retroactively wrote a spec after 30 years of customary usage -- but it does manifest in various edge cases like the one we're debating.

[1] http://seriot.ch/parsing_json.php [2] https://news.ycombinator.com/item?id=12796556

Why are you referencing the obsolete rfc? There is no restriction to object/array for the JSON text in the current rfc https://tools.ietf.org/html/rfc7159
The current RFC recommends the use of an object or array for interoperability with the previous specification. JSON being a bit of a clusterf* of variants, they tried to make the RFC broad then place interoperability limitations on it. (lenient in what you accept, etc etc)
Because I just discovered that there was, at least once, a specification that actually defined JSON that way, where previously I had thought it had only been ambiguously described, and I thought that was interesting.
> There is no requirement or guarantee that an array or an object are the top-level value in a JSON document.

Alas, if only that were true.

RFC 4627:

> A JSON text is a serialized object or array. The MIME media type for JSON text is application/json.

RFC 7159:

> A JSON text is a serialized value. Note that certain previous specifications of JSON constrained a JSON text to be an object or an array. Implementations that generate only objects or arrays where a JSON text is called for will be interoperable in the sense that all implementations will accept these as conforming JSON texts.

IIRC, Ruby's JSON parser was written to be strictly RFC 4627 compliant, and yields a parser error for non-array non-object texts.
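
To make the difference between the two RFCs concrete, here is a rough sketch in Python. Both function names are hypothetical helpers made up for illustration, and this is not a claim about how Ruby's parser (or any other) is implemented; it only mimics the two definitions of a "JSON text" quoted above.

```python
import json


def loads_rfc4627(text):
    """Parse JSON, rejecting top-level values that are not an object or array.

    Hypothetical helper that mimics the RFC 4627 definition of a JSON text.
    """
    value = json.loads(text)
    if not isinstance(value, (dict, list)):
        raise ValueError("RFC 4627 JSON text must be an object or array")
    return value


def loads_rfc7159(text):
    """Parse JSON, accepting any value at the top level (RFC 7159 definition)."""
    return json.loads(text)
```

With these, `loads_rfc7159("4")` returns `4`, while `loads_rfc4627("4")` raises `ValueError`; both accept `[4]`.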

Since JSON isn't versioned, no one has any idea what "JSON" really means, or what "standard" is being followed.

You're right, thanks for the correction! Also kind of reinforces my point I feel. That any JSON document is just that, a JSON document; it doesn't carry more semantics just because you say so. My JSON parser will still just see simple JSON values, no matter how much I tell it that a certain key should really be a URL, not just a string.
True, but that's also true of any XML, RSS, Atom, HTML, etc. Websites abuse HTML all the time, and there's nothing saying that just because something is transferred with application/atom+xml that it will be valid or follow the spec.

It's more of a social agreement. If you get a JSON object from a place you expect a JSON Feed and it has a title and items, then it'll probably work, even if it omits other things.
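
That kind of "probably works" check might look something like the sketch below. The function name is made up, and the only fields it looks at are the `title` and `items` mentioned above; it is a heuristic, not a validator for the actual JSON Feed spec.

```python
import json


def looks_like_json_feed(text):
    """Heuristic: a JSON object with a string "title" and a list of "items"."""
    try:
        doc = json.loads(text)
    except ValueError:
        # Not JSON at all (json.JSONDecodeError is a ValueError subclass).
        return False
    return (
        isinstance(doc, dict)
        and isinstance(doc.get("title"), str)
        and isinstance(doc.get("items"), list)
    )
```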

So we can ditch media types altogether then? What's the point of having actual contracts if all we need is a hand shake and a wink? We're not talking about malformed data here, that's something different entirely and yes – it happens all the time. We're talking about calling a spade a spade.

If it's JSON your program expects then I should be able to throw any valid JSON at your program and it should work. Granted, it probably won't be a very interesting program precisely because JSON is just generic data without any meaningful semantics.

This spec is entirely about attaching semantics to JSON documents, but all that gets lost when you forget to let people know the document carries semantics and just call it generic JSON. Maybe that doesn't matter to a JSON-feed specific app that thinks any JSON is JSON-feed (an equally egregious error) but if there's an expectation that I should be able to point my catch-all program (i.e. web browser) at a URL and it should magically (more like heuristically I guess, potato/tomato) determine that the document retrieved isn't in fact just any JSON then things are about to get real muddy. Web browsers aren't particularly social, so I suspect a social agreement probably won't work that well.

Media types aren't just something that someone thought was a nifty idea back in the dizzy, they are pretty important to how the web functions.

> If it's JSON your program expects then I should be able to throw any valid JSON at your program and it should work.

That's not a valid argument, because JSON is just a serialization format for an arbitrary data structure. You can't throw any arbitrary data structure at any program that accepts data and expect it to be able to accept it. Every program that accepts input requires that input to be in a specific format, which is nearly always more specific than the general syntax of the format. And aside from programs that make strict use of XML schemas, they pretty much all use the handshake-and-wink method for enforcing the contract. (Or to put it another way: documentation and input validation.)

My take on the author's approach is that the content-type is specifying the syntax of the expected input, and the documentation specifies the semantics and details of the data structure. In that respect, the program works like most other programs out there.

Aww that's not fair – if you're going to quote then don't cherry pick and remove the relevant bits.

> If it's JSON your program expects then I should be able to throw any valid JSON at your program and it should work. Granted, it probably won't be a very interesting program precisely because JSON is just generic data without any meaningful semantics.

(Emphasis mine.)

By doing this you're just reinforcing my argument that just parsing any ol' plain JSON won't make for very interesting programs. JSON is just plain dumb data, it doesn't tell you anything interesting. There may be additional semantics you can glean from a document than just its format (HTML is pretty good for this, but oddly enough not a very popular API format) if there are mechanisms to describe data in richer terms – but JSON has none of these. Yet this spec says you should serve this as just plain ol' boring JSON.

> And aside from programs that make strict use of XML schemas, they pretty much all use the handshake-and-wink method for enforcing the contract.

This is just not true. Case in point: web browsers – arguably one of the most successful kinds of program there ever was, with daily active users measuring in the billions – make heavy use of metadata including media types to determine how to interpret the input. Not just by way of format (i.e. media type) but also by way of supplemental semantics (e.g. markup, micro formats, links.)

> My take on the author's approach is that the content-type is specifying the syntax of the expected input, and the documentation specifies the semantics and details of the data structure.

Which could and should be described in a spec, with a corresponding IANA consideration to include a new and specific media type in the appropriate registry – not by overloading an existing one.

I'm not sure what you're arguing. JSONFeed is JSON, unless I'm missing something, just JSON that matches a specific schema.

If I'm pulling JSON from any API, I expect it to match a certain schema. If I expected { "time": 10121} from a web API and they send me "4", then sure, that's valid JSON, but it doesn't match the schema they promised me in the API.

Something that's JSON should be marked JSON, even if we're expecting it to follow a schema.
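
As a small sketch of what that schema expectation means on the client side (in Python; `read_time` is a hypothetical helper, and the integer type for `time` is assumed from the example above):

```python
import json


def read_time(body):
    """Return the integer "time" field, or raise ValueError if the shape is wrong."""
    doc = json.loads(body)
    if not isinstance(doc, dict):
        raise ValueError(f"expected an object, got {type(doc).__name__}")
    value = doc.get("time")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        raise ValueError('expected an integer "time" field')
    return value
```

Both `{ "time": 10121}` and `"4"` get past `json.loads`; only the first gets past the shape check.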

> JSONFeed is JSON, unless I'm missing something, just JSON that matches a specific schema.

Yes, and everything is application/octet-stream, so why have mime types? Because it helps with tooling, discovery, and content negotiation. It is a hint for the poor soul who inherits some undocumented ruby soup calling your endpoint.

Being as specific as possible with MIME types is a convention for a reason. Please don't break it unless you have an explicit reason to.

This is exactly one of the things that media-types solve. Simply using application/json doesn't tell me (consumer) anything about the semantics of what I'm reading. It only tells me what "parser" to use. If we have a proper media-type, like application/hal+json, I know exactly how to create a client for that type: I need to use a JSON parser _and_ use the vocabulary defined by HAL…
> Something that's JSON should be marked JSON, even if we're expecting it to follow a schema.

That's what the +json type suffix is for. I wonder how many people in this thread actually have read the mediatype RFCs, because they definitely don't encourage using mediatypes in the way you're describing.

The whole point of mediatypes is to make it possible to distinguish schemas while also potentially describing the format that the schema is in.
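
A rough sketch of how a generic client could use the suffix, in Python. Both function names are made up for illustration, and the parsing is deliberately simplified (it drops parameters and does no validation of the type itself).

```python
def split_media_type(header_value):
    """Return (type, subtype, suffix) for a Content-Type value.

    Parameters such as "; charset=utf-8" are dropped. The suffix is the part
    after the last "+" in the subtype, or None if there is none.
    """
    essence = header_value.split(";", 1)[0].strip().lower()
    main_type, _, subtype = essence.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        return main_type, subtype, None
    return main_type, base, suffix


def parser_hint(header_value):
    """Guess which generic parser applies: "json", "xml", or None."""
    main_type, subtype, suffix = split_media_type(header_value)
    if suffix == "json" or (main_type, subtype) == ("application", "json"):
        return "json"
    if suffix == "xml" or subtype == "xml":
        return "xml"
    return None
```

So `application/hal+json` still tells a generic client to use a JSON parser, while the `hal` part tells a HAL-aware client which vocabulary applies.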

this tool may help to validate and format JSON data, https://jsonformatter.org
Beware that many JSON parsers don't agree with this, although your interpretation is the correct interpretation of the spec. Some parsers will only accept either an array or object. If you're building a JSON endpoint you'll be safest returning either an array or object.
true
false
Absolutely agree about the MIME type.

Someone filed an issue and created a pull-request for this after you wrote this comment.

https://github.com/brentsimmons/JSONFeed/issues/22

https://github.com/brentsimmons/JSONFeed/pull/23

I hope they will merge it.

That's great, thanks for sharing!
Should this web page have been served as `text/hacker-news-comment-thread+html`?
No. HTML is not formally recognized to be a 'Structured Syntax' upon which semantically richer standalone mediatypes can be built [1]. This is because existing deployments favor a different approach of imbuing additional semantics inside HTML documents -- microformats -- which place the mechanism of understanding on an opportunistic parser, vs. a restrained one that only executes if its preferred mediatype is advertised. Appendix A of RFC 3023 [2] offers a thorough treatment of this matter. Not defining +html is essentially a concession that enables the two schools of thought to coexist side-by-side.

This is the same difference in schools that I express in a different comment [3] in this thread.

[1] https://tools.ietf.org/html/rfc6839 [2] https://tools.ietf.org/html/rfc3023#appendix-A [3] https://news.ycombinator.com/item?id=14361842

No, because this web page isn't using a specialized format.

But if it were using XHTML, then the proper mime type would be application/xhtml+xml.

It seems that all this spec is, is a structure for an api response. I don't see why it should have a different media type.
I don't believe it should be just application/json because it's a specific format of json. There could be multiple json representations of the feed other than jsonfeed that the server supports and the client could define which ones they Accept.

So the server could support all of the following:

application/jsonfeed

application/rss+xml

application/atom+xml

Who knows, maybe RSS and ATOM could be represented in JSON and have the following mime types:

application/rss+json

application/atom+json

If it's just an API response, and it is your API for an application called Widget Factory, then you can, if you want, have your own format:

application/vnd.widgetfactory+json

Generally, such a MIME type should have a specification describing it; otherwise nobody can reliably implement a compatible client. JSON Feed has proposed that specification.
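
To illustrate the Accept part, here is a simplified negotiation sketch in Python. The `SUPPORTED` list reuses the types from this comment (`application/jsonfeed` is hypothetical, not a registered type), the function names are made up, and partial wildcards like `application/*` are deliberately not handled.

```python
# Types from the comment above; "application/jsonfeed" is hypothetical.
SUPPORTED = ["application/jsonfeed", "application/rss+xml", "application/atom+xml"]


def parse_accept(header_value):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header_value.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(accept_header, supported):
    """Return the first supported type the client accepts, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
    return None
```

A client sending `Accept: application/atom+xml;q=0.5, application/rss+xml` would get RSS here, and one sending only `text/html` would get nothing, which a server would typically turn into a 406.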

Well, if JSON had namespaces or a standard validation framework, we could have that conversation.
You mean, like JSON schema?

http://json-schema.org/

Not sure why you want to emulate XML namespaces in JSON, but JSON schemas can include other JSON schemas and extend upon other JSON schemas. That accounts for 99.9% of the use cases for namespaces.

That's my point though – it doesn't have anything to describe metadata, so therefore trying to cram in additional semantics is futile if you still want to call it JSON. Call it something else and you can attach whatever semantics you'd like, but they think it should be served up as `application/json` which means all those semantics go out the window.